Abstract

We study the problem of sparse reconstruction from noisy undersampled measurements when the
following two things are available. (1) We are given partial, and partly erroneous, knowledge of the
signal's support, denoted by $T$. (2) We are also given an erroneous estimate of the signal values on $T$,
denoted by $(\hat{\mu})_T$. In practice, both of these may be available from prior knowledge. Alternatively,
in recursive reconstruction applications, like real-time dynamic MRI, one can use the support estimate
and the signal value estimate from the previous time instant as $T$ and $(\hat{\mu})_T$. In this work, we introduce
regularized modified-BPDN (reg-mod-BPDN) and obtain computable bounds on its reconstruction error.
Reg-mod-BPDN tries to find the signal that is sparsest outside the set $T$, while being "close enough"
to $(\hat{\mu})_T$ on $T$ and while satisfying the data constraint. Corresponding results for modified-BPDN and
BPDN follow as direct corollaries. A second key contribution is an approach to obtain computable error
bounds that hold without any sufficient conditions. This makes it easy to compare the bounds for the
various approaches. Empirical reconstruction error comparisons with many existing approaches are also
provided.

Index Terms 

compressive sensing, sparse reconstruction, modified-CS, partially known support 

I. Introduction 

The goal of this work is to solve the sparse recovery problem O, O, (H, Q, 0. We try to reconstruct 
an m-length sparse vector, x, with support, N, from an n < m length noisy measurement vector, y, 
satisfying 

y = Ax + w (1) 

when the following two things are available: (i) partial, and partly erroneous, knowledge of the signal's
support, denoted by $T$; and (ii) an erroneous estimate of the signal values on $T$, denoted by $(\hat{\mu})_T$. In
(1), $w$ is the measurement noise and $A$ is the measurement matrix. For simplicity, in this work, we just
refer to $x$ as the signal and to $A$ as the measurement matrix. However, in general, $x$ is the sparsity basis
vector (which is either the signal itself or some linear transform of the signal) and $A = H\Phi$ where $H$
is the measurement matrix and $\Phi$ is the sparsity basis matrix. If $\Phi$ is the identity matrix, then $x$ is the
signal itself.

A part of this work was presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
2010 Q]. This research was partially supported by NSF grants ECCS-0725849 and CCF-0917015.

The true support of the signal, $N$, can be rewritten as

$$N = T \cup \Delta \setminus \Delta_e \qquad (2)$$

where

$$\Delta = N \setminus T \quad \text{and} \quad \Delta_e = T \setminus N \qquad (3)$$

are the errors in the support estimate, $T^c$ is the complement set of $T$ and $\setminus$ is the set difference notation
($N \setminus T = N \cap T^c$).
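As a small illustration of (2) and (3), the misses and extras can be computed with plain set operations. This is only a sketch: the index sets are made-up values and `support_errors` is a hypothetical helper name, not something defined in this work.

```python
def support_errors(N, T):
    """Return (Delta, Delta_e): the misses N \\ T and the extras T \\ N."""
    N, T = set(N), set(T)
    return N - T, T - N


# Made-up example: true support N and an erroneous support estimate T.
N = {0, 3, 5, 8}
T = {0, 3, 7}
delta, delta_e = support_errors(N, T)

# Equation (2): N equals (T union Delta) minus Delta_e.
assert (T | delta) - delta_e == N
```

Here index 7 is an extra (in $T$ but not in $N$) and indices 5 and 8 are misses.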

The signal estimate is assumed to be zero along $T^c$, i.e.

$$(\hat{\mu})_{T^c} = 0 \qquad (4)$$

and the signal itself can be rewritten as

$$(x)_{N \cup T} = (\hat{\mu})_{N \cup T} + e \qquad (5)$$
$$(x)_{N^c} = 0 \qquad (6)$$

where $e$ denotes the error in the prior signal estimate. It is assumed that the error energy, $\|e\|_2^2$, is small
compared to the signal energy, $\|x\|_2^2$.

In practical applications, $T$ and $\hat{\mu}$ may be available from prior knowledge. Alternatively, in applications
requiring recursive reconstruction of (approximately) sparse signal or image sequences, with slowly
time-varying sparsity patterns and slowly changing signal values, one can use the support estimate and the signal
value estimate from the previous time instant as the "prior knowledge". A key domain where this problem
occurs is in fast (recursive) dynamic MRI reconstruction from highly undersampled measurements. In
MRI, we typically assume that the images are wavelet sparse. We show slow support and signal value
change for two medical image sequences in Fig. 1. From the figure, we can see that the maximum support
changes for both sequences are less than 2% of the support size and almost all signal value changes
are less than 0.16% of the signal energy. Slow signal value change also implies that a signal value is
small before it gets removed from the support.
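The slow-change statistics above rely on the 99% energy support, i.e. the smallest index set that contains 99% of a vector's energy. A minimal sketch of how such a support and the relative support change between two consecutive frames could be computed is given below. The function names are illustrative assumptions, and the sketch assumes a real-valued, nonzero input vector.

```python
import numpy as np


def energy_support(x, frac=0.99):
    """Smallest index set holding at least `frac` of the energy of x (x must be nonzero)."""
    energy = np.abs(np.asarray(x, dtype=float)) ** 2
    order = np.argsort(energy)[::-1]              # largest-energy entries first
    cum = np.cumsum(energy[order])
    k = int(np.searchsorted(cum, frac * cum[-1])) + 1
    return np.sort(order[:k])


def support_change(N_prev, N_cur):
    """Fractions of support additions and removals, relative to the current support size."""
    prev, cur = set(np.asarray(N_prev).tolist()), set(np.asarray(N_cur).tolist())
    added = len(cur - prev) / len(cur)
    removed = len(prev - cur) / len(cur)
    return added, removed
```

Applied to the wavelet coefficients of consecutive frames, `support_change(energy_support(x_prev), energy_support(x_cur))` would give the kind of addition and removal fractions that the text reports as being below 2%.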

Fig. 1. In (a), we show two medical image sequences: (i) a larynx (vocal tract) image sequence and (ii) a cardiac
image sequence. In (b), $x_t$ is the two-level Daubechies-4 2D discrete wavelet transform (DWT) of the cardiac or
the larynx image at time $t$ and the set $N_t$ is its 99% energy support (the smallest set containing 99% of the
vector's energy). Its size, $|N_t|$, varied between 4121-4183 ($\approx 0.07m$) for larynx and between 1108-1127
($\approx 0.06m$) for cardiac. Notice that all support changes are less than 2% of the support size and almost all
signal value changes are less than 4% of $\|(x_t)_{N_t}\|_2$.

This work has the following contributions.



1) We introduce regularized modified-BPDN (reg-mod-BPDN) and obtain a computable bound on its
reconstruction error using the approach motivated by [3]. Reg-mod-BPDN solves

$$\min_b \; \gamma\|b_{T^c}\|_1 + \frac{1}{2}\|y - Ab\|_2^2 + \frac{1}{2}\lambda\|b_T - \hat{\mu}_T\|_2^2 \qquad (7)$$

i.e. it tries to find the signal that is sparsest outside the set $T$, while being "close enough" to $\hat{\mu}_T$ on
$T$, and while satisfying the data constraint. Reg-mod-BPDN uses the fact that $T$ is a good estimate
of the true support, $N$, and that $\hat{\mu}_T$ is a good estimate of $x_T$. In particular, for $i \in \Delta_e$, this implies
that $|\hat{\mu}_i|$ is close to zero (since $x_i = 0$ for $i \in \Delta_e$).

2) Our second key contribution is to show how to use the reconstruction error bound result to obtain 
another computable bound that holds without any sufficient conditions and is tighter. This allows 
easy bound comparisons of the various approaches. A similar result for mod-BPDN and BPDN 
follows as a direct corollary. 

3) Extensive reconstruction error comparisons with these and many other existing approaches are also 
shown. 

A. Notations and Problem Definition 

For any set $T$ and vector $b$, $b_T$ denotes a sub-vector containing the elements of $b$ with indices in $T$.
$\|b\|_k$ refers to the $\ell_k$ norm of the vector $b$. Also, $\|b\|_0$ counts the number of nonzero elements of $b$.
The notation $T^c$ denotes the set complement of $T$, i.e., $T^c = \{i \in [1, \dots, m], i \notin T\}$. $\emptyset$ is the empty
set.

We use $'$ for transpose. For the matrix $A$, $A_T$ denotes the sub-matrix containing the columns of $A$
with indices in $T$. The matrix norm $\|A\|_p$ is defined as $\|A\|_p = \max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}$. $I_T$ is an identity matrix
on the set of rows and columns indexed by elements in $T$. $0_{T,S}$ is a zero matrix on the set of rows and
columns indexed by elements in $T$ and $S$ respectively.

The notation $\nabla L(b)$ denotes the gradient of the function $L(b)$ with respect to $b$.

When we say $b$ is supported on $T \cup S$ we mean that the support of $b$ (set of indices where $b$ is nonzero)
is a subset of $T \cup S$.

Our goal is to reconstruct a sparse vector, $x$, with support, $N$, from the noisy measurement vector, $y$,
satisfying (1). We assume partial knowledge of the support, denoted by $T$, and of the signal estimate on
$T$, denoted by $(\hat{\mu})_T$. The support estimate may contain errors: misses, $\Delta$, and extras, $\Delta_e$, defined in
(3). The signal estimate, $\hat{\mu}$, is assumed to be zero along $T^c$, i.e. it satisfies (4), and the signal, $x$, satisfies
(5).

B. Related Work 

The sparse reconstruction problem, without using any support or signal value knowledge, has been
studied for a long time 0, 0, |@), 0, 0. It tries to find the sparsest signal among all signals that
satisfy the data constraint, i.e. it solves $\min_b \|b\|_0$ s.t. $y = Ab$. This brute-force search has exponential
complexity. One class of practical approaches to solve this involves replacing $\|b\|_0$ by $\|b\|_1$, which is
the closest norm to $\ell_0$ that makes the problem convex (basis pursuit) Q. For noisy measurements, the
data constraint becomes an inequality constraint. However, this assumes that the noise is bounded and
the noise bound is available. In practical applications where this may not be available, one can use the
Lagrangian version which solves

$$\min_b \; \gamma\|b\|_1 + \frac{1}{2}\|y - Ab\|_2^2 \qquad (8)$$

This is called basis pursuit denoising (BPDN) /El/. Since this solves an unconstrained optimization
problem, it is also faster (needed for large problems). The error bound of BPDN was obtained in [3].
Error bounds for the constrained version of BPDN were obtained in [7], [8].

The problem of sparse reconstruction with partial support knowledge was introduced in our work [9],
[10] and also in parallel in Khajehnejad et al. [11] and in vonBorries et al. [12]. In [9], [10], we proposed
an approach called modified-CS which tries to find the signal that is sparsest outside the set $T$ and satisfies
the data constraint. We obtained exact reconstruction conditions for it by using the restricted isometry
approach [13]. When measurements are noisy, for the same reasons as above, one can use the Lagrangian
version:

$$\min_b \; \gamma\|b_{T^c}\|_1 + \frac{1}{2}\|y - Ab\|_2^2 \qquad (9)$$

We call this modified-BPDN (mod-BPDN). Its error was bounded in the conference version of this work
[03, while the error of its constrained version was bounded by Jacques [14].

In [11], Khajehnejad et al. assumed a probabilistic support prior and proposed a weighted $\ell_1$ solution.
They also obtained exact reconstruction thresholds for weighted $\ell_1$ by using the overall approach of
Donoho [15]. In Fig. 3, we show comparisons with the noisy Lagrangian version of weighted $\ell_1$ which
solves:

$$\min_b \; \gamma\|b_{T^c}\|_1 + \gamma'\|b_T\|_1 + \frac{1}{2}\|y - Ab\|_2^2 \qquad (10)$$

Our earlier work on Least Squares CS-residual (LS-CS) and Kalman Filtered CS-residual (KF-CS)
[16], [17] can also be interpreted as a possible solution for the current problem, although it was proposed
in the context of recursive reconstruction of sparse signal sequences.

Reg-mod-BPDN may also be interpreted as a Bayesian CS or a model-based CS approach. Recent
work in this area includes [18], [19], [20], [21], [22], [23], [24].

C. Some Related Approaches** 

Before going further, we discuss below a few approaches that are related to, but different from reg- 
mod-BPDN, and we argue when and why these will be worse than reg-mod-BPDN. This section may be 
skipped on a quick reading. We show comparisons with all these in Fig. 3.

The first is what can be called CS-residual or CS-diff which computes

$$\hat{x} = \hat{\mu} + \hat{b}, \text{ where } \hat{b} \text{ solves } \min_b \; \gamma\|b\|_1 + \frac{1}{2}\|y - A\hat{\mu} - Ab\|_2^2 \qquad (11)$$

This has the following limitation. It does not use the fact that when $T$ is an accurate estimate of the true
support, $(x)_{T^c}$ is much sparser than the full $(x - \hat{\mu})$. The exception is if the signal value
prior is so strong that $(x - \hat{\mu})$ is zero (or very small) on all or a part of $T$.

CS-residual is also related to LS-CS and KF-CS. LS-CS solves (11) but with $\hat{\mu}_T$ being the LS estimate
computed assuming that the signal is supported on $T$ and with $(\hat{\mu})_{T^c} = 0$. For a static problem, KF-CS
can be interpreted as computing the regularized LS estimate on $T$ and using that as $\hat{\mu}_T$. LS-CS and
KF-CS also have a limitation similar to CS-residual.

Another seemingly related approach is what can be called CS-mod-residual. It computes

$$\hat{x}_T = \hat{\mu}_T, \; \hat{x}_{T^c} = \hat{b}_c, \text{ where } \hat{b}_c \text{ solves } \min_{b_c} \; \frac{1}{2}\|y - A_T\hat{\mu}_T - A_{T^c}b_c\|_2^2 + \gamma\|b_c\|_1 \qquad (12)$$

where $b_c$ stands for $(b)_{T^c}$. This is solving a sparse recovery problem on $T^c$, i.e. it is implicitly assuming
that $x_T$ is either equal to $\hat{\mu}_T$ or very close to it. Thus, this also works only when the signal value prior
is very strong.

Both CS-residual and CS-mod-residual can be interpreted as extensions of BPDN, and [3, Theorem 8]
can be used to bound their error. In either case, the bound will contain terms proportional to $\|x_T - \hat{\mu}_T\|_2$
and as a result, it will be large when the prior is not good.$^1$ This is also seen from our simulation
experiments shown in Fig. 3, where we provide comparisons for the case of a good signal value prior
(0.1% error in the initial signal estimate) and a bad signal value prior (10% error in the initial signal estimate).
We vary support errors from 5% to 20% misses, while keeping the extras fixed at 10%.

Reg-mod-BPDN can also be confused with modified-CS-residual which computes [25]

$$\hat{x} = \hat{\mu} + \hat{b}, \text{ where } \hat{b} \text{ solves } \min_b \; \frac{1}{2}\|y - A\hat{\mu} - Ab\|_2^2 + \gamma\|b_{T^c}\|_1 \qquad (13)$$

This is indeed related to reg-mod-BPDN and in fact this inspired it. We studied this empirically in [25].
However, one cannot get good error bounds for it in any easy fashion. Notice that the minimization is
over the entire vector $b$, while the $\ell_1$ cost is only on $b_{T^c}$.

As pointed out by an anonymous reviewer, one may consider solving the following variant of
reg-mod-BPDN (we call this reg-mod-BPDN-var):

$$\min_b \; \gamma\|b_{T^c}\|_1 + \frac{1}{2}\|y - Ab\|_2^2 + \frac{1}{2}\lambda\|b - \hat{\mu}\|_2^2 \qquad (14)$$

Since $\hat{\mu}$ is supported on $T$, the regularization term can be rewritten as $\lambda\|b - \hat{\mu}\|_2^2 = \lambda\|b_T - \hat{\mu}_T\|_2^2 + \lambda\|b_{T^c}\|_2^2$.
Thus, in addition to the $\ell_1$ norm cost on $b_{T^c}$ imposed by the first term, this last term is also imposing
an $\ell_2$ norm cost on it. If $\lambda$ is large enough, the $\ell_2$ norm cost will encourage the energy of the solution
to be spread out on $T^c$, thus causing it to be less sparse. Since the true $x$ is very sparse on $T^c$ ($|\Delta|$
is small compared to the support size also), we will end up with a larger recovery error.$^2$ We show an
example in Fig. 2. Notice that the reg-mod-BPDN-var solution has many (about 7) more extras on $T^c$
than the reg-mod-BPDN solution and thus has a larger total error. Also, see Fig. Ha). However, if we
compare the two approaches for compressible signal sequences, e.g. the larynx sequence, it is difficult
to say which will be better (see Fig. 0).

Finally, one may solve the following (we can call it reg-BPDN)

$$\min_b \; \gamma\|b\|_1 + \frac{1}{2}\|y - Ab\|_2^2 + \frac{1}{2}\lambda\|b - \hat{\mu}\|_2^2 \qquad (15)$$

This has two limitations. (1) Like CS-residual, this also does not use the fact that when $T$ is an accurate
estimate of the true support, $(x)_{T^c}$ is much sparser than the full $(x - \hat{\mu})$. (2) Its last term
is the same as that of reg-mod-BPDN-var, which causes the same problem as above.
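One way to see how the Lagrangian problems (7), (8), (9), (10), (14) and (15) differ is to note that each one combines the same data term with a per-index $\ell_1$ weight and a per-index $\ell_2$ weight on $(b - \hat{\mu})$. The sketch below encodes that reading. The function names and the variant labels are illustrative assumptions, `mu` is the full-length estimate (zero on $T^c$), and CS-residual style methods such as (11) to (13) are not covered because they change the data term.

```python
import numpy as np


def variant_weights(name, m, T, gamma, lam=0.0, gamma_prime=0.0):
    """Per-index l1 weights w1 and l2 weights w2 for each Lagrangian variant."""
    on_T = np.zeros(m, dtype=bool)
    on_T[np.asarray(sorted(T), dtype=int)] = True
    off_T = ~on_T
    ones, zeros = np.ones(m), np.zeros(m)
    table = {
        "bpdn": (gamma * ones, zeros),                                # (8)
        "mod-bpdn": (gamma * off_T, zeros),                           # (9)
        "weighted-l1": (gamma * off_T + gamma_prime * on_T, zeros),   # (10)
        "reg-mod-bpdn": (gamma * off_T, lam * on_T),                  # (7)
        "reg-mod-bpdn-var": (gamma * off_T, lam * ones),              # (14)
        "reg-bpdn": (gamma * ones, lam * ones),                       # (15)
    }
    return table[name]


def objective(A, y, b, mu, w1, w2):
    """sum_i w1_i |b_i| + 0.5 ||y - A b||_2^2 + 0.5 sum_i w2_i (b_i - mu_i)^2."""
    r = y - A @ b
    return float(w1 @ np.abs(b) + 0.5 * (r @ r) + 0.5 * (w2 @ (b - mu) ** 2))
```

In this view, reg-mod-BPDN is the only variant whose $\ell_1$ weight lives entirely on $T^c$ while its $\ell_2$ weight lives entirely on $T$, which matches the two limitations discussed above for reg-mod-BPDN-var and reg-BPDN.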

$^1$ In either case, one can assume that $(x - \hat{\mu})$ is supported on $\Delta$ and the "noise" is $w + A_T(x_T - \hat{\mu}_T)$. Thus, CS-residual
error can be bounded by $C(A, \Delta)(\|w\|_2 + \|A_T(x_T - \hat{\mu}_T)\|_2)$ while CS-mod-residual error can be bounded by $\|x_T - \hat{\mu}_T\|_2 + C(A_{T^c}, \Delta)(\|w\|_2 + \|A_T(x_T - \hat{\mu}_T)\|_2)$.

2 In the limit if a/A/2 is much larger than 7, we may get a completely non-sparse solution. 



August 5, 2011 



DRAFT 



7 



[Figure 2 plots: (a) Comparisons on $N$; (b) Comparisons on $N^c$. Legend: Original Signal, regmodBPDN, regmodBPDN-var, CS-residual.]

Fig. 2. We plot the true 256-length signal, $x$, and its reconstructions using reg-mod-BPDN, reg-mod-BPDN-var
and CS-residual. In (a), the large leftmost elements are those on $N \cap T$ followed by the two smaller ones on $\Delta$. In
(b), we plot the reconstructions on $N^c$. The first two indices correspond to the set $\Delta_e$ and the rest are elements of
$(N \cup T)^c$. In (b), the original signal, $x$, is all zeros, whereas the reconstructions have some error. We are showing
one realization of the case of Fig. 3(a).



D. Paper Organization 

We introduce reg-mod-BPDN in Sec. II. We obtain computable bounds on its reconstruction error in
Sec. III. The simultaneous comparison of upper bounds of multiple approaches becomes difficult because
their results hold under different sufficient conditions. In Sec. IV, we address this issue by showing how to
obtain a tighter error bound that also holds without any sufficient conditions and is still computable. In both
sections, the bounds for mod-BPDN and BPDN follow as direct corollaries. In Sec. V, the above result is
used for easy numerical comparisons between the upper bounds of various approaches (reg-mod-BPDN,
mod-BPDN, BPDN and LS-CS) and for numerically evaluating the tightness of the bounds with both Gaussian
measurements and partial Fourier measurements. We also provide reconstruction error comparisons with
CS-residual, LS-CS, KF-CS, CS-mod-residual, mod-CS-residual and reg-mod-BPDN-var, as well as with
weighted $\ell_1$, mod-BPDN and BPDN for (a) static sparse recovery from random-Gaussian measurements;
and for (b) recovering a larynx image sequence from simulated MRI measurements. Conclusions are
given in Sec. VI.

II. Regularized Modified-BPDN (Reg-mod-BPDN) 

Consider the sparse recovery problem when partial support knowledge is available. As explained earlier,
one can solve mod-BPDN given in (9). When the support estimate is accurate, i.e. $|\Delta|$ and $|\Delta_e|$ are small,
mod-BPDN provides accurate recovery with fewer measurements than what BPDN needs. However, it
puts no cost on $b_T$ except the cost imposed by the data term. Thus, when very few measurements are
available or when the noise is large, $b_T$ can become larger than required (in order to reduce the data
term). A similar, though lesser, bias will occur with weighted $\ell_1$ also when $\gamma' < \gamma$. To address this, when
reliable prior signal value knowledge is available, we can instead solve

$$\min_b \; \gamma\|b_{T^c}\|_1 + \frac{1}{2}\|y - Ab\|_2^2 + \frac{1}{2}\lambda\|b_T - \hat{\mu}_T\|_2^2 \qquad (16)$$

which we call reg-mod-BPDN. Its solution, denoted by $\hat{x}$, serves as the reconstruction of the unknown
signal, $x$. Notice that the first term helps to find the solution that is sparsest outside $T$, the second term
imposes the data constraint while the third term imposes closeness to $\hat{\mu}$ along $T$.

Mod-BPDN is the special case of (16) when $\lambda = 0$. BPDN is also a special case with $\lambda = 0$ and
$T = \emptyset$ (so that $\Delta = N$).
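Since the objective of reg-mod-BPDN is a smooth quadratic plus an $\ell_1$ term on $T^c$, one simple way to minimize it numerically is a proximal-gradient (ISTA-style) iteration that soft-thresholds only the entries outside $T$. The sketch below is an illustration under that assumption and is not the solver used in this work; the function names, the fixed iteration count and the step-size choice are all assumptions.

```python
import numpy as np


def reg_mod_bpdn_objective(A, y, b, on_T, mu, gamma, lam):
    """gamma ||b_{T^c}||_1 + 0.5 ||y - A b||_2^2 + 0.5 lam ||b_T - mu_T||_2^2."""
    r = y - A @ b
    d = (b - mu)[on_T]
    return float(gamma * np.abs(b[~on_T]).sum() + 0.5 * (r @ r) + 0.5 * lam * (d @ d))


def reg_mod_bpdn(A, y, T, mu_T, gamma, lam, n_iter=2000):
    """Proximal-gradient iterations for reg-mod-BPDN.

    mu_T must be ordered like sorted(T). Returns the final iterate, which is
    only an approximation of the minimizer after finitely many iterations.
    """
    m = A.shape[1]
    T = np.asarray(sorted(T), dtype=int)
    on_T = np.zeros(m, dtype=bool)
    on_T[T] = True
    mu = np.zeros(m)
    mu[T] = mu_T
    # Step 1/L, where L upper-bounds the largest eigenvalue of A'A + lam*diag(on_T).
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)
    b = mu.copy()
    for _ in range(n_iter):
        grad = A.T @ (A @ b - y) + lam * on_T * (b - mu)
        z = b - step * grad
        shrunk = np.sign(z) * np.maximum(np.abs(z) - step * gamma, 0.0)
        b = np.where(on_T, z, shrunk)   # soft-threshold only outside T
    return b
```

Setting `lam=0` gives an iteration for mod-BPDN, and additionally passing an empty `T` gives plain BPDN, mirroring the special cases noted above.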

A. Limitations and Assumptions 

A key limitation of adding the regularizing term, $\lambda\|b_T - \hat{\mu}_T\|_2^2$, is as follows. It encourages the solution
to be close to $(\hat{\mu})_{\Delta_e}$ which is not zero. As a result, $(\hat{x})_{\Delta_e}$ will also not be zero (except if $\lambda$ is very
small) even though $(x)_{\Delta_e} = 0$. Thus, even in the noise-free case, reg-mod-BPDN will not achieve exact
reconstruction. In both noise-free and noisy cases, if $(\hat{\mu})_{\Delta_e}$ is large, $(\hat{x})_{\Delta_e}$ being close to $(\hat{\mu})_{\Delta_e}$ can
result in large error. Thus, we need the assumption that $(\hat{\mu})_{\Delta_e}$ is small.

For the reason above, when we estimate the support of $\hat{x}$, we need to use a nonzero threshold, i.e.
compute

$$\hat{N} = \{i : |\hat{x}_i| > \rho\} \qquad (17)$$

with a $\rho > 0$.

To get a small error reconstruction, reg-mod-BPDN requires the following:

1) $T$ is a good estimate of the true signal's support, $N$, i.e. $|\Delta|$ and $|\Delta_e|$ are small compared to $|N|$;
and

2) $\hat{\mu}_T$ is a good estimate of $x_T$. For $i \in \Delta_e$, this implies that $|\hat{\mu}_i|$ is close to zero (since $x_i = 0$ for
$i \in \Delta_e$).

3) For accurate (exact) support estimation, we also need that most (all) nonzero elements of $x$ are
larger than $\max_{i \in \Delta_e} |\hat{\mu}_i|$.

The smallest nonzero elements of $x$ are usually on the set $\Delta$. In this case, the third assumption is
equivalent to requiring that most (all) elements of $x_\Delta$ are larger than $\max_{i \in \Delta_e} |\hat{\mu}_i|$.

B. Dynamic Reg-Mod-BPDN for Recursive Recovery

An important application of reg-mod-BPDN is for recursively reconstructing a time sequence of sparse
signals from undersampled measurements, e.g. for dynamic MRI. To do this, at time $t$ we solve (16) with
$T = \hat{N}_{t-1}$, where $\hat{N}_{t-1}$ is the support estimate of the previous reconstruction, $\hat{x}_{t-1}$, and $(\hat{\mu})_T = (\hat{x}_{t-1})_T$
and $(\hat{\mu})_{T^c} = 0$. At the initial time, $t = 0$, we can either initialize with BPDN, or with mod-BPDN using $T$
from prior knowledge, e.g. for wavelet sparse images, $T$ could be the set of indices of the approximation
coefficients. We summarize the stepwise dynamic reg-mod-BPDN approach in Algorithm 1. Notice that
at $t = 0$, one may need more measurements since the prior knowledge of $T$ may not be very accurate.
Hence, we use $y_0 = A_0 x_0 + w_0$ where $A_0$ is an $n_0 \times m$ measurement matrix with $n_0 > n$.

In Algorithm 1, we should reiterate that for support estimation, we need to use a threshold $\rho > 0$.
The threshold should be large enough so that most elements of $\Delta_{e,t} := T \setminus N_t = \hat{N}_{t-1} \setminus N_t$ do not get
detected into the support.

To address an important concern of an anonymous reviewer, we briefly explain here why
reg-mod-BPDN will be stable by describing our result from [26]. By "stable", we mean that the reconstruction
error and the support estimation errors remain bounded by a time-invariant and small value at all times.
In [26], we showed the following for the constrained version of reg-mod-BPDN (but the same ideas
apply to reg-mod-BPDN also). If (i) $\rho$ is large enough (so that $\hat{N}_t$ does not falsely detect any element
that got removed from $N_t$); (ii) the newly added elements to the current support, $N_t$, either get added
at a large enough value to get detected immediately, or within a finite delay their magnitude becomes
large enough to get detected; and (iii) the matrix $A$ satisfies certain restricted isometry property (RIP)
[13] conditions (for a given support size and support change size); then reg-mod-BPDN is stable.

Algorithm 1 Dynamic Reg-mod-BPDN

At $t = 0$, compute $\hat{x}_0$ as the solution of $\min_b \gamma\|(b)_{T^c}\|_1 + \frac{1}{2}\|y_0 - A_0 b\|_2^2$, where $T$ is either empty
or is available from prior knowledge. Compute $\hat{N}_0 = \{i \in [1, \dots, m] : |(\hat{x}_0)_i| > \alpha\}$. Set $T \leftarrow \hat{N}_0$ and
$(\hat{\mu})_T \leftarrow (\hat{x}_0)_T$.

For $t > 0$, do

1) Reg-Mod-BPDN. Let $T = \hat{N}_{t-1}$ and let $\hat{\mu}_T = (\hat{x}_{t-1})_T$. Compute $\hat{x}_t$ as the solution of (16).

2) Estimate Support. $\hat{N}_t = \{i \in [1, \dots, m] : |(\hat{x}_t)_i| > \rho\}$.

3) Output the reconstruction $\hat{x}_t$.

Feedback $\hat{N}_t$ and $\hat{x}_t$; increment $t$, and go to step 1.
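The recursion of Algorithm 1 can be sketched as a short loop. The sketch below leaves the actual minimization to a user-supplied routine: `solve(A, y, T, mu_T, gamma, lam)` is a hypothetical callable assumed to return a minimizer of (16), so that calling it with `lam = 0` gives the mod-BPDN initialization. The names `alpha` and `rho` denote the initial and subsequent support thresholds; everything else is an illustrative assumption.

```python
import numpy as np


def dynamic_reg_mod_bpdn(ys, As, solve, gamma, lam, rho, alpha, T0=()):
    """Run the recursion over measurement vectors ys[t] and matrices As[t].

    Returns the list of reconstructions and the list of support estimates.
    """
    T = np.asarray(sorted(T0), dtype=int)
    # t = 0: mod-BPDN, i.e. no regularization toward a prior signal estimate.
    x_hat = solve(As[0], ys[0], T, np.zeros(T.size), gamma, 0.0)
    T = np.flatnonzero(np.abs(x_hat) > alpha)
    estimates, supports = [x_hat], [T]
    for A, y in zip(As[1:], ys[1:]):
        # Step 1: previous support and previous values on it act as the prior.
        x_hat = solve(A, y, T, x_hat[T], gamma, lam)
        # Step 2: threshold with rho > 0 to estimate the new support.
        T = np.flatnonzero(np.abs(x_hat) > rho)
        # Step 3: output; T and x_hat are fed back to the next time instant.
        estimates.append(x_hat)
        supports.append(T)
    return estimates, supports
```

Because `np.flatnonzero` returns sorted indices, the values `x_hat[T]` are passed in the same order as the support indices, which is the ordering the solver is assumed to expect.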



III. Bounding the Reconstruction Error 

In this section, we bound the reconstruction error of reg-mod-BPDN. Since mod-BPDN and BPDN are
special cases, their results follow as direct corollaries. The result for BPDN is the same as [3, Theorem
8]. In Sec. III-A, we define the terms needed to state our result. In Sec. III-B, we state our result and discuss
its implications. In Sec. III-C, we give the proof outline.

A. Definitions 

We begin by defining the function that we want to minimize as

$$L(b) \triangleq L_1(b) + \gamma\|b_{T^c}\|_1 \qquad (18)$$

where

$$L_1(b) \triangleq \frac{1}{2}\|y - Ab\|_2^2 + \frac{1}{2}\lambda\|b_T - \hat{\mu}_T\|_2^2 \qquad (19)$$

contains the two $\ell_2$ norm terms (data fidelity term and the regularization term). If we constrain $b$ to be
supported on $T \cup S$ for some $S \subseteq T^c$, then the minimizer of $L_1(b)$ will be the regularized least squares
(LS) estimator obtained when we put a weight $\lambda$ on $\|b_T - \hat{\mu}_T\|_2^2$ and a weight zero on $\|b_S - \hat{\mu}_S\|_2^2$.

Let $S$ be a given subset of $\Delta$. Next, we define three matrices which will be frequently used in our
results. Let

$$Q_{T,\lambda}(S) \triangleq A_{T \cup S}{}' A_{T \cup S} + \lambda \begin{bmatrix} I_T & 0_{T,S} \\ 0_{S,T} & 0_{S,S} \end{bmatrix} \qquad (20)$$

$$M_{T,\lambda} \triangleq I - A_T (A_T{}' A_T + \lambda I_T)^{-1} A_T{}' \qquad (21)$$

$$P_{T,\lambda}(S) \triangleq (A_S{}' M_{T,\lambda} A_S)^{-1} \qquad (22)$$

where $I_T$ is a $|T| \times |T|$ identity matrix and $0_{T,S}$, $0_{S,T}$, $0_{S,S}$ are all-zeros matrices with sizes $|T| \times |S|$,
$|S| \times |T|$ and $|S| \times |S|$.

Assumption 1: We assume that $Q_{T,\lambda}(\Delta)$ is invertible. This implies that, for any $S \subseteq \Delta$, the functions
$L(b)$ and $L_1(b)$ are strictly convex over the set of all vectors supported on $T \cup S$.

Proposition 1: When $\lambda > 0$, $Q_{T,\lambda}(S)$ is invertible if $A_S$ has full rank. When $\lambda = 0$ (mod-BPDN),
this will hold if $A_{T \cup S}$ has full rank.
The proof is easy and is given in Appendix A.

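Let $S \subseteq \Delta$. Consider minimizing $L(b)$ over $b$ supported on $T \cup S$. When $b_{(T \cup S)^c} = 0$ and Assumption
1 holds, $L(b_{T \cup S})$ is strictly convex and thus has a unique minimizer. The same holds for $L_1(b_{T \cup S})$.
Define their respective unique minimizers as

$$d_{T,\lambda}(S) \triangleq \arg\min_b L(b) \text{ subject to } b_{(T \cup S)^c} = 0 \qquad (23)$$

$$c_{T,\lambda}(S) \triangleq \arg\min_b L_1(b) \text{ subject to } b_{(T \cup S)^c} = 0 \qquad (24)$$

As explained earlier, $c_{T,\lambda}(S)$ is the regularized LS estimate of $x$ when assuming that $x$ is supported on
$T \cup S$ and with the weights mentioned earlier. It is easy to see that

$$[c_{T,\lambda}(S)]_{T \cup S} = Q_{T,\lambda}(S)^{-1} \left( A_{T \cup S}{}' y + \begin{bmatrix} \lambda \hat{\mu}_T \\ 0_S \end{bmatrix} \right), \quad [c_{T,\lambda}(S)]_{(T \cup S)^c} = 0 \qquad (25)$$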
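The closed form in (25) follows from setting the gradient of $L_1$ to zero on $T \cup S$, which gives the linear system $Q_{T,\lambda}(S)\, c_{T \cup S} = A_{T \cup S}{}' y + [\lambda \hat{\mu}_T ; 0_S]$. A minimal numerical sketch of that computation is shown below; the function name and argument layout are assumptions made for illustration, with columns ordered as $[T, S]$.

```python
import numpy as np


def regularized_ls(A, y, T, S, mu_T, lam):
    """Regularized LS estimate c_{T,lam}(S): zero outside T u S, see (20) and (25)."""
    T, S = list(T), list(S)
    TS = T + S                                    # columns ordered as [T, S]
    A_TS = A[:, TS]
    D = np.diag(np.r_[np.ones(len(T)), np.zeros(len(S))])
    Q = A_TS.T @ A_TS + lam * D                   # Q_{T,lam}(S)
    rhs = A_TS.T @ y + lam * np.r_[np.asarray(mu_T, dtype=float), np.zeros(len(S))]
    c = np.zeros(A.shape[1])
    c[TS] = np.linalg.solve(Q, rhs)               # fails if Q is singular
    return c
```

With `lam=0` this reduces to the ordinary LS estimate on $T \cup S$, as used for mod-BPDN and BPDN.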
In a fashion similar to [3], define

$$\text{ERC}_{T,\lambda}(S) \triangleq 1 - \max_{\omega \notin T \cup S} \|P_{T,\lambda}(S) A_S{}' M_{T,\lambda} A_\omega\|_1 \qquad (26)$$

This is different from the ERC of [3] but simplifies to it when $T = \emptyset$, $S = N$ and $\lambda = 0$. In [3], the
ERC, which in our notation is $\text{ERC}_{\emptyset,0}(N)$, being strictly positive, along with $\gamma$ approaching zero, ensured
exact recovery of BPDN in the noise-free case. Hence, in [3], ERC was an acronym for Exact Recovery
Coefficient. In this work, the same holds for mod-BPDN. If $\text{ERC}_{T,0}(\Delta) > 0$, the solution of mod-BPDN
approaches the true $x$ as $\gamma$ approaches zero. We explain this further in Remark 2 below. However, no
similar claim can be made for reg-mod-BPDN. On the other hand, for the reconstruction error bounds,
ERC serves the exact same purpose for reg-mod-BPDN as it does for BPDN in [3]: $\text{ERC}_{T,\lambda}(\Delta) > 0$
and $\gamma$ greater than a certain lower bound ensures that the reg-mod-BPDN (or mod-BPDN) error can be
bounded by modifying the approach of [3].
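Since the ERC is built only from $A$, $T$, $S$ and $\lambda$, it can be evaluated numerically. The sketch below assumes the definition reads $\text{ERC}_{T,\lambda}(S) = 1 - \max_{\omega \notin T \cup S} \|P_{T,\lambda}(S) A_S{}' M_{T,\lambda} A_\omega\|_1$ with $M_{T,\lambda}$ and $P_{T,\lambda}(S)$ as in (21) and (22); the function name is hypothetical, and the code assumes that $S$ is non-empty and that $T \cup S$ does not cover all indices.

```python
import numpy as np


def erc(A, T, S, lam):
    """ERC_{T,lam}(S) = 1 - max over omega outside T u S of ||P A_S' M A_omega||_1."""
    n, m = A.shape
    T, S = list(T), list(S)
    A_S = A[:, S]
    if T:
        A_T = A[:, T]
        G = A_T.T @ A_T + lam * np.eye(len(T))
        M = np.eye(n) - A_T @ np.linalg.solve(G, A_T.T)   # M_{T,lam}
    else:
        M = np.eye(n)                                     # empty T: M is the identity
    P = np.linalg.inv(A_S.T @ M @ A_S)                    # P_{T,lam}(S)
    used = set(T) | set(S)
    rest = [j for j in range(m) if j not in used]
    cols = P @ A_S.T @ M @ A[:, rest]                     # one column per omega
    return 1.0 - np.abs(cols).sum(axis=0).max()
```

For an empty `T` and `lam=0` the expression inside the norm becomes the pseudo-inverse of $A_S$ applied to $A_\omega$, which is the form of the ERC for plain BPDN.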

B. Reconstruction error bound 

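The reconstruction error can be bounded as follows.
Theorem 1: If $Q_{T,\lambda}(\Delta)$ is invertible, $\text{ERC}_{T,\lambda}(\Delta) > 0$ and

$$\gamma \ge \gamma^*_{T,\lambda}(\Delta) \triangleq \frac{\|A_{(T \cup \Delta)^c}{}' (y - A c_{T,\lambda}(\Delta))\|_\infty}{\text{ERC}_{T,\lambda}(\Delta)} \qquad (27)$$

then,

1) $L(b)$ has a unique minimizer, $\hat{x}$.

2) The minimizer, $\hat{x}$, is equal to $d_{T,\lambda}(\Delta)$, and thus is supported on $T \cup \Delta$.

3) Its error can be bounded as

$$\|x - \hat{x}\|_2 \le \gamma\sqrt{|\Delta|}\, f_1(\Delta) + \lambda f_2(\Delta)\|x_T - \hat{\mu}_T\|_2 + f_3(\Delta)\|w\|_2$$

where

$$f_1(\Delta) \triangleq \sqrt{\|(A_T{}' A_T + \lambda I_T)^{-1} A_T{}' A_\Delta P_{T,\lambda}(\Delta)\|_2^2 + \|P_{T,\lambda}(\Delta)\|_2^2},$$
$$f_2(\Delta) \triangleq \|Q_{T,\lambda}(\Delta)^{-1}\|_2,$$
$$f_3(\Delta) \triangleq \|Q_{T,\lambda}(\Delta)^{-1} A_{T \cup \Delta}{}'\|_2, \qquad (28)$$

$P_{T,\lambda}(\Delta)$ is defined in (22) and $Q_{T,\lambda}(\Delta)$ in (20).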
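The bound is called computable because the constants in (28) depend only on $A$, $T$, $\Delta$ and $\lambda$. The sketch below assumes that (28) reads $f_1 = \sqrt{\|(A_T{}'A_T + \lambda I_T)^{-1} A_T{}' A_\Delta P\|_2^2 + \|P\|_2^2}$, $f_2 = \|Q^{-1}\|_2$ and $f_3 = \|Q^{-1} A_{T \cup \Delta}{}'\|_2$, with $\|\cdot\|_2$ the spectral norm. The function names are hypothetical and both $T$ and $\Delta$ are assumed non-empty.

```python
import numpy as np


def bound_constants(A, T, Delta, lam):
    """f1, f2, f3 of (28); columns of A_{T u Delta} are ordered as [T, Delta]."""
    T, Delta = list(T), list(Delta)
    A_T, A_D = A[:, T], A[:, Delta]
    A_TD = A[:, T + Delta]
    G_inv = np.linalg.inv(A_T.T @ A_T + lam * np.eye(len(T)))
    M = np.eye(A.shape[0]) - A_T @ G_inv @ A_T.T              # (21)
    P = np.linalg.inv(A_D.T @ M @ A_D)                        # (22)
    D = np.diag(np.r_[np.ones(len(T)), np.zeros(len(Delta))])
    Q_inv = np.linalg.inv(A_TD.T @ A_TD + lam * D)            # inverse of (20)

    def spec(X):
        return np.linalg.norm(X, 2)                           # spectral norm

    f1 = np.sqrt(spec(G_inv @ A_T.T @ A_D @ P) ** 2 + spec(P) ** 2)
    f2 = spec(Q_inv)
    f3 = spec(Q_inv @ A_TD.T)
    return f1, f2, f3


def theorem1_bound(gamma, lam, f, n_delta, prior_err, noise_norm):
    """gamma*sqrt(|Delta|)*f1 + lam*f2*||x_T - mu_T||_2 + f3*||w||_2."""
    f1, f2, f3 = f
    return gamma * np.sqrt(n_delta) * f1 + lam * f2 * prior_err + f3 * noise_norm
```

The bound is only valid when the conditions of Theorem 1 hold, so in practice one would also check the ERC and the lower bound on $\gamma$ before using the returned value.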

Corollary 1 (corollaries for mod-BPDN and BPDN): The result for mod-BPDN follows by setting
$\lambda = 0$ in Theorem 1. The result for BPDN follows by setting $\lambda = 0$, $T = \emptyset$ (and so $\Delta = N$).
This result is the same as [3, Theorem 8].

Remark 1 (smallest $\gamma$): Notice that the error bound above is an increasing function of $\gamma$. Thus $\gamma =
\gamma^*_{T,\lambda}(\Delta)$ gives the smallest bound.

In words, Theorem 1 says that, if $Q_{T,\lambda}(\Delta)$ is invertible, $\text{ERC}_{T,\lambda}(\Delta)$ is positive, and $\gamma$ is large
enough (larger than $\gamma^*$), then $L(b)$ has a unique minimizer, $\hat{x}$, and $\hat{x}$ is supported on $T \cup \Delta = N \cup \Delta_e$.
This means that the only wrong elements that can possibly be part of the support of $\hat{x}$ are elements
of $\Delta_e$. Moreover, the error between $\hat{x}$ and the true $x$ is bounded by a value that is small as long as
the noise, $\|w\|_2$, is small, the prior term, $\|x_T - \hat{\mu}_T\|_2$, is small and $\gamma^*_{T,\lambda}(\Delta)$ is small. By rewriting
$y - A c_{T,\lambda}(\Delta) = A(x - c_{T,\lambda}(\Delta)) + w$ and using Lemma 2 (given in the Appendix) one can upper bound
$\gamma^*$ by terms that depend on $\|w\|_2$ and $\|x_T - \hat{\mu}_T\|_2$. Thus, as long as these two are small, the bound is
small.

As shown in Proposition 1, $Q_{T,\lambda}(\Delta)$ is invertible if $\lambda > 0$ and $A_\Delta$ is full rank or if $A_{T \cup \Delta}$ is full rank.
Next, we use the idea of [3, Corollary 10] to show that $\text{ERC}_{T,0}(\Delta)$ is an Exact Recovery Coefficient
for mod-BPDN.

Remark 2 (ERC and exact recovery of mod-BPDN): For mod-BPDN, $c_{T,0}(\Delta)$ is the LS estimate when
$x$ is supported on $T \cup \Delta$. Using (25), (1), and the fact that $x$ is supported on $N \subseteq T \cup \Delta$, it is easy to
see that in the noise-free ($w = 0$) case, $[c_{T,0}(\Delta)]_{T \cup \Delta} = x_{T \cup \Delta}$. Hence the numerator of $\gamma^*_{T,0}(\Delta)$ will be zero.
Thus, using Theorem 1, if $\text{ERC}_{T,0}(\Delta) > 0$, the mod-BPDN error satisfies $\|x - \hat{x}\|_2 \le \gamma\sqrt{|\Delta|}\, f_1(\Delta)$.
Thus the mod-BPDN solution, $\hat{x}$, will approach the true $x$ as $\gamma$ approaches zero. Moreover, as long as
$\gamma$ < |^'| , at least the support of $\hat{x}$ will equal the true support, $N$.$^3$

We show a numerical comparison of the results of reg-mod-BPDN, mod-BPDN and BPDN in Table
I (simulation details given in Sec. V). Notice that BPDN needs 90% measurements for its sufficient
conditions to start holding (ERC to become positive) whereas mod-BPDN only needs 19%. Moreover,
even with 90% measurements, the ERC of BPDN is just positive and very small. As a result, its error
bound is large (27% normalized mean squared error (NMSE)). Similarly, notice that mod-BPDN needs
$n \ge 19\%$ for its sufficient conditions to start holding ($A_{T \cup \Delta}$ to become full rank, which is needed for
$Q_{T,0}(\Delta)$ to be invertible). For reg-mod-BPDN, which only needs $A_\Delta$ to be full rank, $n = 13\%$ suffices.

Remark 3: A sufficient conditions' comparison only provides a comparison of when a given result 
can be applied to provide a bound on the reconstruction error. In other words, it tells us under what 
conditions we can guarantee that the reconstruction error of a given approach will be small (below a 
bound). Of course this does not mean that we cannot get small error even when the sufficient condition 
does not hold, e.g., in simulations, BPDN provides a good reconstruction using far fewer than 90%
measurements. However, when $n < 90\%$ we cannot bound its reconstruction error using Theorem 1
above.

Due to lack of space, we have removed the discussion relating the ERC to the RIP constants and 
then using the resulting lower bound for a sufficient conditions' comparison. We have moved it to the 
Supplementary Material. 

C. Proof Outline 

To prove Theorem 1, we use the following approach motivated by that of [3].

1) We first bound $\|d_{T,\lambda}(\Delta) - c_{T,\lambda}(\Delta)\|_2$ by simplifying the necessary and sufficient condition for it
to be the minimizer of $L(b)$ when $b$ is supported on $T \cup \Delta$. This is done in Lemma 1 in Appendix
B.

2) We bound $\|c_{T,\lambda}(\Delta) - x\|_2$ using the expression for $c_{T,\lambda}(\Delta)$ in (25) and substituting $y = A_{T \cup \Delta} x_{T \cup \Delta} +
w$ in it (recall that $x$ is zero outside $T \cup \Delta$). This is done in Lemma 2 in Appendix B.

3) We can bound $\|d_{T,\lambda}(\Delta) - x\|_2$ using the above two bounds and the triangle inequality.

4) We use an approach similar to [3, Lemma 6] to find the sufficient conditions under which $d_{T,\lambda}(\Delta)$
is also the unconstrained unique minimizer of $L(b)$, i.e. $\hat{x} = d_{T,\lambda}(\Delta)$. This is done in Lemma 3 in
Appendix B.

The last step (Lemma 3) helps prove the first two parts of Theorem 1. Combining the above four steps,
we get the third part (error bound). We give the lemmas in Appendix B. They are proved in Appendices
D1, D2 and D3.

$^3$ If we bounded the $\ell_\infty$ norm of the error as done in (5) we would get a looser upper bound on the allowed $\gamma$'s for this.

Two key differences in the above approach with respect to the result of [3] are

• $c_{T,\lambda}(\Delta)$ is the regularized LS estimate instead of the LS estimate in [3]. This helps obtain a better
and simpler error bound of reg-mod-BPDN than when using the LS estimate. Of course, when $\lambda = 0$
(mod-BPDN or BPDN), $c_{T,0}(\Delta)$ is just the LS estimate again.

• For reg-mod-BPDN (and also for mod-BPDN), the subgradient set of the $\ell_1$ term is $\partial\|b_{T^c}\|_1 |_{b = d_{T,\lambda}(\Delta)}$
and so any $\phi$ in this set is zero on $T$, and only has $\|\phi_\Delta\|_\infty \le 1$. Since $|\Delta| \ll |N|$, this helps to get
a tighter bound on $\|c_{T,\lambda}(\Delta) - d_{T,\lambda}(\Delta)\|_2$ in step 1 above as compared to that for BPDN [3] (see
proof of Lemma 1 for details).

IV. Tighter Bounds without Sufficient Conditions 

The problem with the error bounds for reg-mod-BPDN, mod-BPDN, BPDN or LS-CS [27] is that
they all hold under different sufficient conditions. This makes it difficult to compare them. Moreover, the
bound is particularly loose when $n$ is such that the sufficient conditions just get satisfied. This is because
the ERC is just positive but very small (resulting in a very large $\gamma^*$ and hence a very large bound). To
address this issue, in this section, we obtain a bound that holds without any sufficient conditions and that
is also tighter, while still being computable. The key idea that we use is as follows:

• we modify Theorem 1 to hold for "sparse-compressible" signals [27], i.e. for sparse signals, $x$, in
which some nonzero coefficients out of the set $\Delta$ are small ("compressible") compared to the rest;
and then

• we minimize the resulting bound over all allowed split-ups of $x$ into non-compressible and
compressible parts.
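The second step of this idea can be pictured as a search over candidate subsets of $\Delta$ that are treated as the non-compressible part. The sketch below is a brute-force illustration only: `bound_fn` is a hypothetical callable that returns the bound for a given subset, or `None` when the sufficient conditions fail for that subset, and the enumeration costs $2^{|\Delta|}$ evaluations. The formal result may restrict or organize the search differently.

```python
from itertools import combinations


def tightest_bound(Delta, bound_fn):
    """Minimize bound_fn over all subsets of Delta; skip subsets where it returns None."""
    best_val, best_set = float("inf"), None
    items = sorted(Delta)
    for k in range(len(items) + 1):
        for subset in combinations(items, k):
            candidate = frozenset(subset)
            val = bound_fn(candidate)
            if val is not None and val < best_val:
                best_val, best_set = val, candidate
    return best_val, best_set
```

If no subset satisfies the conditions, the function returns `(inf, None)`, which signals that no bound of this form is available.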

Let $\tilde{\Delta} \subseteq \Delta$ be such that the conditions of Theorem 1 hold for it. Then the first step involves modifying
Theorem 1 to bound the error for reconstructing $x$ when we treat $x_{\Delta \setminus \tilde{\Delta}}$ as the "compressible" part. The
main difference here is in bounding $\|c_{T,\lambda}(\tilde{\Delta}) - x\|_2$, which now has a larger bound because of $x_{\Delta \setminus \tilde{\Delta}}$. We
do this in Lemma 4 in Appendix C. Notice from the proofs of Lemma 1 and Lemma 3 in Appendix
D1 and D3 that nothing in their result changes if we replace $\Delta$ by a $\tilde{\Delta} \subseteq \Delta$. Combining Lemma 4 with
Lemmas 1 and 3 applied for $\tilde{\Delta}$ instead of $\Delta$ leads to the following corollary.

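Corollary 2: Consider a $\tilde{\Delta} \subseteq \Delta$. If $Q_{T,\lambda}(\tilde{\Delta})$ is invertible, $\mathrm{ERC}_{T,\lambda}(\tilde{\Delta}) > 0$, and $\gamma = \gamma^*_{T,\lambda}(\tilde{\Delta})$, then

$$\|x - \hat{x}\|_2 \le f(T, \lambda, \Delta, \tilde{\Delta}, \gamma^*_{T,\lambda}(\tilde{\Delta})) \tag{29}$$

where

$$f(T, \lambda, \Delta, \tilde{\Delta}, \gamma) \triangleq \gamma \sqrt{|\tilde{\Delta}|}\, f_1(\tilde{\Delta}) + \lambda f_2(\tilde{\Delta}) \|x_T - \hat{\mu}_T\|_2 + f_3(\tilde{\Delta}) \|w\|_2 + f_4(\tilde{\Delta}) \|x_{\Delta \setminus \tilde{\Delta}}\|_2, \tag{30}$$

$$f_4(\tilde{\Delta}) \triangleq \sqrt{\|Q_{T,\lambda}(\tilde{\Delta})^{-1} A_{T \cup \tilde{\Delta}}{}' A_{\Delta \setminus \tilde{\Delta}}\|_2^2 + 1}, \tag{31}$$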

$f_1(\cdot)$, $f_2(\cdot)$, $f_3(\cdot)$ are defined in (25) and $\gamma^*_{T,\lambda}(\tilde{\Delta})$ in (27).
Proof: The proof is given in Appendix C1.

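In order to get a bound that depends only on $\|x_T - \hat{\mu}_T\|_2$, $\|x_{\Delta \setminus \tilde{\Delta}}\|_2$, the noise, $w$, and the sets $T$, $\Delta$, $\Delta_e$,
we can further bound $\gamma^*_{T,\lambda}(\tilde{\Delta})$ by rewriting $y - A c_{T,\lambda}(\tilde{\Delta}) = A(x - c_{T,\lambda}(\tilde{\Delta})) + w$ and then bounding
$\|x - c_{T,\lambda}(\tilde{\Delta})\|_2$ using Lemma 4. Doing this gives the following corollary.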

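Corollary 3: If $Q_{T,\lambda}(\tilde{\Delta})$ is invertible, $\mathrm{ERC}_{T,\lambda}(\tilde{\Delta}) > 0$, and $\gamma = \gamma^*_{T,\lambda}(\tilde{\Delta})$, then

$$\|x - \hat{x}\|_2 \le g(\tilde{\Delta}) \tag{32}$$

where

$$g(\tilde{\Delta}) \triangleq g_1 \|x_T - \hat{\mu}_T\|_2 + g_2 \|w\|_2 + g_3 \|x_{\Delta \setminus \tilde{\Delta}}\|_2 + g_4 \tag{33}$$

$$g_1 \triangleq \lambda f_2(\tilde{\Delta}) \left( \frac{\sqrt{|\tilde{\Delta}|}\, f_1(\tilde{\Delta})\, \mathrm{maxcor}(\tilde{\Delta})}{\mathrm{ERC}_{T,\lambda}(\tilde{\Delta})} + 1 \right),$$

$$g_2 \triangleq \frac{\sqrt{|\tilde{\Delta}|}\, f_1(\tilde{\Delta}) f_3(\tilde{\Delta})\, \mathrm{maxcor}(\tilde{\Delta})}{\mathrm{ERC}_{T,\lambda}(\tilde{\Delta})} + f_3(\tilde{\Delta}),$$

$$g_3 \triangleq \frac{\sqrt{|\tilde{\Delta}|}\, f_1(\tilde{\Delta}) f_4(\tilde{\Delta})\, \mathrm{maxcor}(\tilde{\Delta})}{\mathrm{ERC}_{T,\lambda}(\tilde{\Delta})} + f_4(\tilde{\Delta}),$$

$$g_4 \triangleq \frac{\sqrt{|\tilde{\Delta}|}\, \|A_{(T \cup \tilde{\Delta})^c}{}' w\|_\infty\, f_1(\tilde{\Delta})}{\mathrm{ERC}_{T,\lambda}(\tilde{\Delta})},$$

$$\mathrm{maxcor}(\tilde{\Delta}) \triangleq \max_{i \in (T \cup \tilde{\Delta})^c} \|A_i{}' A_{T \cup \Delta}\|_2,$$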

$f_1(\cdot)$, $f_2(\cdot)$, $f_3(\cdot)$ and $f_4(\cdot)$ are defined in (25) and (31), and $\gamma^*_{T,\lambda}(\tilde{\Delta})$ in (27).
Proof: The proof is given in Appendix C2.

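Using the above corollary and minimizing over all allowed $\tilde{\Delta}$'s, we get the following result.

Theorem 2: Let

$$\tilde{\Delta}^* \triangleq \arg\min_{\tilde{\Delta} \in \mathcal{G}} g(\tilde{\Delta}) \tag{34}$$

where

$$\mathcal{G} \triangleq \{\tilde{\Delta} : \tilde{\Delta} \subseteq \Delta,\ \mathrm{ERC}_{T,\lambda}(\tilde{\Delta}) > 0,\ Q_{T,\lambda}(\tilde{\Delta}) \text{ is invertible}\} \tag{35}$$

If $\gamma = \gamma^*_{T,\lambda}(\tilde{\Delta}^*)$, then

1) $L(b)$ has a unique minimizer, $\hat{x}$, supported on $T \cup \tilde{\Delta}^*$.

2) The error bound is

$$\|x - \hat{x}\|_2 \le g(\tilde{\Delta}^*) \tag{36}$$

($\gamma^*_{T,\lambda}(\tilde{\Delta})$ is defined in (27)).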

Proof: This result follows by minimizing over all allowed $\tilde{\Delta}$'s from Corollary 3.

Compare Theorem 2 with Theorem 1. Theorem 1 holds only when the complete set $\Delta$ belongs to $\mathcal{G}$,
whereas Theorem 2 holds always (we only need to set $\gamma$ appropriately). Moreover, even when $\Delta$ does
belong to $\mathcal{G}$, Theorem 1 gives the error bound obtained by choosing $\tilde{\Delta}^* = \Delta$. However, the above result minimizes
over all allowed $\tilde{\Delta}$'s, thus giving a tighter bound, especially for the case when the sufficient conditions
of Theorem 1 are only just satisfied and $\mathrm{ERC}_{T,\lambda}(\Delta)$ is positive but very small. A similar comparison also
holds for the mod-BPDN and BPDN results.

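The problem with Theorem 2 is that its bound is not computable in practice (the computational cost is exponential
in $|\Delta|$). Notice that $g(\tilde{\Delta}^*) := \min_{\tilde{\Delta} \in \mathcal{G}} g(\tilde{\Delta})$ can be rewritten as

$$g(\tilde{\Delta}^*) = \min_{\tilde{\Delta} \in \mathcal{G}} g(\tilde{\Delta}) = \min_{0 \le k \le |\Delta|}\ \min_{\tilde{\Delta} \in \mathcal{G}_k} g(\tilde{\Delta}), \quad \text{where} \quad \mathcal{G}_k := \mathcal{G} \cap \{\tilde{\Delta} \subseteq \Delta : |\tilde{\Delta}| = k\} \tag{37}$$

Let $d := |\Delta|$. The minimization over $\mathcal{G}_k$ is expensive since it requires searching over all $\binom{d}{k}$ size-$k$
subsets of $\Delta$ to first find which ones belong to $\mathcal{G}_k$ and then find the minimum over all $\tilde{\Delta} \in \mathcal{G}_k$. The
total computation cost to do the former for all sets $\mathcal{G}_0, \mathcal{G}_1, \ldots, \mathcal{G}_d$ is $O(\sum_{k=0}^{d} \binom{d}{k}) = O(2^d)$, i.e. it is
exponential in $d$. This makes the bound computation intractable for large problems.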
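
As a rough illustration of why this search is expensive, the following Python sketch enumerates every subset of $\Delta$, keeps those accepted by a membership test, and minimizes a bound function over them. The names `brute_force_bound`, `g` and `in_G` are hypothetical placeholders, and the toy `toy_g` and `toy_in_G` below are made-up stand-ins chosen only to exercise the loop; they are not the $g(\tilde{\Delta})$ or the set $\mathcal{G}$ of the paper.

```python
from itertools import combinations

def brute_force_bound(delta, g, in_G):
    """Minimize g over every subset of `delta` that `in_G` accepts.

    Returns (best value, best subset, number of subsets examined).
    The number examined is always 2**len(delta).
    """
    best_val, best_set, examined = float("inf"), None, 0
    for k in range(len(delta) + 1):
        for subset in combinations(delta, k):
            examined += 1
            s = frozenset(subset)
            if in_G(s):
                val = g(s)
                if val < best_val:
                    best_val, best_set = val, s
    return best_val, best_set, examined

# Toy stand-ins (assumptions for illustration only).
x_on_delta = {3: 0.9, 7: 0.1, 11: 0.5, 20: 0.05}
delta = sorted(x_on_delta)

def toy_g(s):
    # Norm of the entries left out of s, plus a made-up penalty growing with |s|.
    left_out = sum(x_on_delta[i] ** 2 for i in delta if i not in s) ** 0.5
    return left_out + 0.2 * len(s)

def toy_in_G(s):
    # Made-up admissibility rule: pretend sets larger than 3 fail the conditions.
    return len(s) <= 3

best_val, best_set, examined = brute_force_bound(delta, toy_g, toy_in_G)
```

With $d = 4$ the loop already visits 16 subsets; each extra element of $\Delta$ doubles that count.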

A. Obtaining a Computable Bound 

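In most cases of practical interest, the term that has the maximum variability over different sets in $\mathcal{G}_k$
is $\|x_{\Delta \setminus \tilde{\Delta}}\|_2$. The multipliers $g_1$, $g_2$, $g_3$ and $g_4$ vary very slightly for different sets in a given $\mathcal{G}_k$. Using
this fact, we can obtain the following upper bound on $\min_{\tilde{\Delta} \in \mathcal{G}_k} g(\tilde{\Delta})$, which is only slightly looser and also
holds without sufficient conditions, but is computable in polynomial time.

Define $\tilde{\Delta}^{**}(k)$ and $B_k$ as follows

$$\tilde{\Delta}^{**}(k) \triangleq \arg\min_{\{\tilde{\Delta} \subseteq \Delta,\ |\tilde{\Delta}| = k\}} \|x_{\Delta \setminus \tilde{\Delta}}\|_2$$

$$B_k \triangleq \begin{cases} g(\tilde{\Delta}^{**}(k)) & \text{if } \tilde{\Delta}^{**}(k) \in \mathcal{G}_k \\ \infty & \text{otherwise} \end{cases} \tag{38}$$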
Then, clearly

$$\min_{\tilde{\Delta} \in \mathcal{G}_k} g(\tilde{\Delta}) \le B_k \tag{39}$$

since $\min_{\tilde{\Delta} \in \mathcal{G}_k} g(\tilde{\Delta}) \le g(\tilde{\Delta})$ for any $\tilde{\Delta} \in \mathcal{G}_k$ and it is also trivially no larger than infinity. For any $k$, the set $\tilde{\Delta}^{**}(k)$ can
be obtained by sorting the elements of $x_\Delta$ in decreasing order of magnitude and letting $\tilde{\Delta}^{**}(k)$ contain
the indices of the $k$ largest elements. Doing this takes $O(d \log d)$ time since sorting takes $O(d \log d)$ time.
Computation of $B_k$ requires matrix multiplications and inversions which are $O(k^3)$. Thus, the total cost
of doing this for all $k$ is at most $O(d^4)$, which is still polynomial in $d$.
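
The sorting step above can be sketched as follows. This is a minimal illustration under stated assumptions: `computable_bound`, `g` and `in_G` are hypothetical names, the caller is assumed to supply a function `g` that evaluates the bound for a candidate set and a test `in_G` for membership in $\mathcal{G}_k$, and the toy versions below are invented purely so the sketch runs. It evaluates `g` only $d + 1$ times, once per nested candidate set, and then returns the smallest $B_k$.

```python
import math

def computable_bound(x_on_delta, g, in_G):
    """Evaluate B_k for k = 0..d using nested sets of largest-magnitude entries.

    x_on_delta maps each index in Delta to the signal value at that index.
    Returns (k_min, candidate set at k_min, B at k_min, list of all B_k).
    """
    # Indices of Delta sorted by decreasing magnitude of x.
    order = sorted(x_on_delta, key=lambda i: abs(x_on_delta[i]), reverse=True)
    B = []
    for k in range(len(order) + 1):
        candidate = frozenset(order[:k])  # indices of the k largest entries
        B.append(g(candidate) if in_G(candidate) else math.inf)
    k_min = min(range(len(B)), key=B.__getitem__)
    return k_min, frozenset(order[:k_min]), B[k_min], B

# Toy stand-ins (assumptions for illustration only).
x_on_delta = {3: 0.9, 7: 0.1, 11: 0.5, 20: 0.05}

def toy_g(s):
    left_out = sum(v ** 2 for i, v in x_on_delta.items() if i not in s) ** 0.5
    return left_out + 0.2 * len(s)

def toy_in_G(s):
    return len(s) <= 3

k_min, subset, bound, all_B = computable_bound(x_on_delta, toy_g, toy_in_G)
```

In this toy case the nested search returns the same set as an exhaustive search over all 16 subsets would, which mirrors the situation in which (39) holds with equality.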

Therefore, we get the following bound that is computable in polynomial time and that still holds 
without sufficient conditions and is much tighter than Theorem 1. 

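Theorem 3: Let

$$k_{\min} \triangleq \arg\min_{0 \le k \le |\Delta|} B_k \quad \text{and} \quad \tilde{\Delta}^{**} \triangleq \tilde{\Delta}^{**}(k_{\min}) \tag{40}$$

where $B_k$ and $\tilde{\Delta}^{**}(k)$ are defined in (38). If $\gamma = \gamma^*_{T,\lambda}(\tilde{\Delta}^{**})$, then

1) $L(b)$ has a unique minimizer, $\hat{x}$, supported on $T \cup \tilde{\Delta}^{**}$.

2) The error bound is

$$\|x - \hat{x}\|_2 \le g(\tilde{\Delta}^{**}) \tag{41}$$

($\gamma^*_{T,\lambda}(\tilde{\Delta})$ is defined in (27)).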

Corollary 4 (corollaries for mod-BPDN and BPDN): The result for mod-BPDN follows by setting
$\lambda = 0$ in Theorem 3. The result for BPDN follows by setting $\lambda = 0$, $T = \emptyset$ (and so $\Delta = N$) in
Theorem 3.

When $n$ and $s = |N|$ are large enough, the above bound is either only slightly larger than, or often actually
equal to, that of Theorem 2 (e.g. in Fig. 5(a), $m = 256$, $n = 0.13m = 33$, $s = 0.1m = 26$). The
reason for the equality is that the minimizing value of $k$ is the one that is small enough to ensure that
$g_1$, $g_2$, $g_3$, $g_4$ are small. When $k$ is small, $g_1$, $g_2$, $g_3$, $g_4$, ERC and $Q(\tilde{\Delta})$ have very similar values for
all sets $\tilde{\Delta}$ of the same size $k$. In (33), the only term with significant variability for different sets $\tilde{\Delta}$ of
the same size $k$ is $\|x_{\Delta \setminus \tilde{\Delta}}\|_2$. Thus, (a) $\arg\min_{\mathcal{G}_k} g(\tilde{\Delta}) = \arg\min_{\mathcal{G}_k} \|x_{\Delta \setminus \tilde{\Delta}}\|_2$ and (b) $\mathcal{G}_k$ is equal to
$\{\tilde{\Delta} \subseteq \Delta,\ |\tilde{\Delta}| = k\}$. Thus, (39) holds with equality and so the bounds from Theorems 3 and 2 are equal.
As $n$ and $s = |N|$ approach infinity, one can, in fact, use a law of large numbers (LLN) argument to
prove that both bounds will be equal with high probability (w.h.p.). The key idea here is the same as
above: we show that as $n, s$ go to infinity, w.h.p., $g_1$, $g_2$, $g_3$, $g_4$, $Q$ and ERC are equal for all sets $\tilde{\Delta}$
of any given size $k$. The main idea of this argument is given in the Supplementary Material. It will be
developed in future work.
V. Numerical Experiments

In this section, we show both upper bound comparisons and actual reconstruction error comparisons. 
The upper bound comparison only tells us that the performance guarantees of reg-mod-BPDN are better 
than those for the other methods. To actually demonstrate that reg-mod-BPDN outperforms the others, we 
need to compare the actual reconstruction errors. Based on useful comments by anonymous reviewers, 
we have reorganized this section. After giving the simulation model, we first show the reconstruction 
error comparisons for recovering simulated sparse signals from random Gaussian measurements. Next, 
we show comparisons for recursive dynamic MRI reconstruction of a larynx image sequence. In this
comparison, we also show the usefulness of Theorem 3 in helping us select a good value of $\gamma$. In
the last three subsections, we show numerical comparisons of the results of the various theorems. The
upper bound comparisons of Theorem 3 and the comparison of the corresponding reconstruction errors
suggest that the bounds for reg-mod-BPDN and BPDN are tight under the scenarios evaluated. Hence,
they can be used as a proxy to decide which algorithm to use when. We show this for both random
Gaussian and partial Fourier measurements.

A. Simulation Model 

The notation $z = \pm a$ means that we generate each element of the vector $z$ independently and each
is either $+a$ or $-a$ with probability 1/2. The notation $\nu \sim \mathcal{N}(0, \Sigma)$ means that $\nu$ is generated from a
Gaussian distribution with mean 0 and covariance matrix $\Sigma$. We use $\lfloor a \rfloor$ to denote the largest integer less
than or equal to $a$. Independent and identically distributed is abbreviated as iid. Also, N-RMSE refers to
the normalized root mean squared error.

We use the recursive reconstruction application [16], [10] to motivate the simulation model. In this case,
assuming that slow support and slow signal value change hold (see Fig. 1), we can use the reconstructed
value of the signal at the previous time as $\hat{\mu}$ and its support as $T$. To simulate the effect of slow signal
value change, we let $x_N = \mu_N + \nu$ where $\nu$ is a small iid Gaussian deviation and we let $\hat{\mu}_{T \cap N} = \mu_{T \cap N}$
(and so $x_{T \cap N} = \hat{\mu}_{T \cap N} + \nu_{T \cap N}$).

The extras' set, $\Delta_e = T \setminus N$, contains elements that got removed from the support at the current time
or at a few previous times (but so far did not get removed from the support estimate). In most practical
applications, only small valued elements at the previous time get removed from the support and hence
the magnitude of $\hat{\mu}$ on $\Delta_e$ will be small. We use $\beta_s$ to denote this small magnitude, i.e. we simulate

$$(\hat{\mu})_{\Delta_e} = \pm\beta_s.$$

The misses' set at time $t$, $\Delta$, definitely includes the elements that just got added to the support at
$t$ or the ones that previously got added but did not get detected into the support estimate so far. The
new elements typically get added at a small value and their value slowly increases to a large one. Thus,
elements in $\Delta$ will either have small magnitude (corresponding to the current newly added ones), or will
have larger magnitude but still smaller than that of elements already in $N \cap T$. To simulate this, we do the
following. (a) We simulate the elements on $N \cap T$ to have large magnitude, $\beta_l$, i.e. we let $(\mu)_{N \cap T} = \pm\beta_l$.
(b) We split the set $\Delta$ into two disjoint parts, $\Delta_1$ and $\Delta_2 = \Delta \setminus \Delta_1$. The set $\Delta_1$ contains the small (e.g.
newly added) elements, i.e. $(\mu)_{\Delta_1} = \pm\beta_s$. The set $\Delta_2$ contains the larger elements, though still with
magnitudes smaller than those in $N \cap T$, i.e. $(\mu)_{\Delta_2} = \pm\beta_m$, where $\beta_l > \beta_m > \beta_s$.
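In summary, we use the following simulation model.

$$(x)_N = (\mu)_N + \nu, \quad \nu \sim \mathcal{N}(0, \sigma_p^2 I)$$
$$(x)_{N^c} = 0 \tag{42}$$

where

$$(\mu)_{N \cap T} = \pm\beta_l$$
$$(\mu)_{\Delta_1} = \pm\beta_s, \quad (\mu)_{\Delta_2} = \pm\beta_m$$
$$(\mu)_{N^c} = 0 \tag{43}$$

and

$$(\hat{\mu})_{T \cap N} = (\mu)_{T \cap N} = \pm\beta_l$$
$$(\hat{\mu})_{T^c} = 0 \tag{44}$$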

We generate the support of $x$, $N$, of size $|N|$, uniformly at random from $[1, \ldots, m]$. We generate $\Delta$
with size $|\Delta|$ and $\Delta_e$ with size $|\Delta_e|$ uniformly at random from $N$ and from $N^c$ respectively. The set
$\Delta_1$ of size $|\Delta_1| = \lfloor |\Delta|/2 \rfloor$ is generated uniformly at random from $\Delta$. The set $\Delta_2 = \Delta \setminus \Delta_1$. We let
$T = N \cup \Delta_e \setminus \Delta$. We generate $\mu$ and then $x$ using (43) and (42). We generate $\hat{\mu}$ using (44).

In some simulations, we simulated the more difficult case where $\beta_m = \beta_s$. In this case, all elements
on $\Delta$ were identically generated and hence we did not need $\Delta_1$.
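
The generation steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' code: the function name `simulate_signal`, its argument names, the zero-based indexing and the integer set sizes passed in the usage line are all assumptions made here, and the values on the extras' set follow the $\pm\beta_s$ rule stated earlier in this subsection.

```python
import random

def simulate_signal(m, size_N, size_delta, size_delta_e,
                    beta_l, beta_m, beta_s, sigma_p2, seed=0):
    """Draw (x, mu, mu_hat) and the index sets N, T, Delta, Delta_e, Delta_1."""
    rng = random.Random(seed)

    def pm(a):
        # +a or -a with probability 1/2 each
        return a if rng.random() < 0.5 else -a

    N = set(rng.sample(range(m), size_N))                      # support of x
    delta = set(rng.sample(sorted(N), size_delta))             # misses, subset of N
    outside = [i for i in range(m) if i not in N]
    delta_e = set(rng.sample(outside, size_delta_e))           # extras, subset of N^c
    delta_1 = set(rng.sample(sorted(delta), size_delta // 2))  # small misses
    T = (N | delta_e) - delta                                  # T = N u Delta_e \ Delta

    mu = [0.0] * m
    for i in sorted(N):
        if i in delta_1:
            mu[i] = pm(beta_s)
        elif i in delta:
            mu[i] = pm(beta_m)      # Delta_2 = Delta \ Delta_1
        else:
            mu[i] = pm(beta_l)      # N intersect T

    x = [0.0] * m
    for i in sorted(N):
        x[i] = mu[i] + rng.gauss(0.0, sigma_p2 ** 0.5)

    mu_hat = [0.0] * m
    for i in sorted(T):
        mu_hat[i] = mu[i] if i in N else pm(beta_s)  # extras get +/- beta_s

    return {"x": x, "mu": mu, "mu_hat": mu_hat, "N": N, "T": T,
            "delta": delta, "delta_e": delta_e, "delta_1": delta_1}

# Illustrative sizes only (assumed roundings of the fractions of |N|).
sim = simulate_signal(m=256, size_N=26, size_delta=4, size_delta_e=2,
                      beta_l=1.0, beta_m=0.4, beta_s=0.2, sigma_p2=1e-3)
```

Setting `beta_m` equal to `beta_s` reproduces the more difficult case in which all elements on $\Delta$ are identically generated.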

B. Reconstruction Error Comparisons 

In Fig. 3, we compare the Monte Carlo average of the reconstruction error of reg-mod-BPDN with
that of mod-BPDN, BPDN, weighted $\ell_1$ HJ] given in (10), CS-residual given in (11), CS-mod-residual
given in (12) and modified-CS-residual [25] given in (13). Simulation was done according to the model
specified above. We used random Gaussian measurements in this simulation, i.e. we generated $A$ as an
$n \times m$ matrix with iid zero mean Gaussian entries and normalized each column to unit $\ell_2$ norm.
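
A minimal standard-library sketch of this measurement setup is given below. The function names `gaussian_measurement_matrix` and `noisy_measurements`, the seeds and the sizes in the usage line are assumptions made for illustration; the sketch simply draws iid Gaussian entries, rescales each column to unit $\ell_2$ norm, and forms $y = Ax + w$ as in (1).

```python
import math
import random

def gaussian_measurement_matrix(n, m, seed=0):
    """Return an n x m matrix (list of rows) with iid N(0, 1) entries,
    with every column rescaled to unit l2 norm."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    for j in range(m):
        norm = math.sqrt(sum(A[i][j] ** 2 for i in range(n)))
        for i in range(n):
            A[i][j] /= norm
    return A

def noisy_measurements(A, x, noise_var, seed=1):
    """Compute y = A x + w with w ~ N(0, noise_var * I)."""
    rng = random.Random(seed)
    sd = math.sqrt(noise_var)
    return [sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, sd)
            for row in A]

A = gaussian_measurement_matrix(n=33, m=256)
x = [0.0] * 256
x[5], x[100] = 1.0, -0.4          # a toy sparse signal
y = noisy_measurements(A, x, noise_var=1e-5)
```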

Fig. 3. The N-RMSE for reg-mod-BPDN, mod-BPDN, BPDN, LS-CS, KF-CS, weighted $\ell_1$, CS-residual, CS-mod-residual
and modified-CS-residual are plotted. For $n = 0.13m$, reg-mod-BPDN has smaller errors than those of
mod-BPDN and the gap is larger when the signal estimate is good. For $n = 0.3m$, the errors of reg-mod-BPDN,
mod-BPDN and weighted $\ell_1$ are close and all small.

We experimented with two choices of $n$, $n = 0.13m$ (where reg-mod-BPDN outperforms mod-BPDN)
and $n = 0.3m$ (where both are similar) and two values of $\sigma_p^2$, $\sigma_p^2 = 0.001$ (good prior) and $\sigma_p^2 = 0.1$
(bad prior). For the cases of Fig. 3(a) ($n = 0.13m$, $\sigma_p^2 = 0.001$) and Fig. 3(b) ($n = 0.13m$, $\sigma_p^2 = 0.1$),
we used signal length $m = 256$, support size $|N| = 0.1m = 26$ and support extras' size, $|\Delta_e| = 0.1|N|$.
The misses' size, $|\Delta|$, was varied between 0 and $0.2|N|$ (these numbers were motivated by the medical
imaging application; we used larger numbers than what are shown in Fig. 1). We used $\beta_l = 1$, $\beta_m = 0.4$
and $\beta_s = 0.2$. The noise variance was $\sigma_w^2 = 10^{-5}$. For the last two figures, Fig. 3(c) ($n = 0.3m$,
$\sigma_p^2 = 0.001$) and Fig. 3(d) ($n = 0.3m$, $\sigma_p^2 = 0.1$), for which $n$ was larger, we used $\beta_m = \beta_s = 0.25$
which is a more difficult case for reg-mod-BPDN. For Fig. 3(c), we also used a larger noise variance
$\sigma_w^2 = 10^{-4}$. All other parameters were the same.

Fig. 4. Reconstructing a 32 × 32 block of the actual (compressible) larynx sequence from partial Fourier
measurements. Measurements $n = 0.18m$ for $t = 0$ and $n = 0.06m$ for $t > 0$. Reg-mod-BPDN has the smallest
reconstruction error among all methods.

For applications where some training data is available, $\gamma$ and $\lambda$ for reg-mod-BPDN can be chosen by
interpreting the reg-mod-BPDN solution as the maximum a posteriori (MAP) estimate under a certain
prior signal model (assume $x_T$ is Gaussian with mean $\hat{\mu}_T$ and variance $\sigma_p^2$ and $x_{T^c}$ is independent of
$x_T$ and is iid Laplacian with parameter $b$). This idea is explained in detail in [10]. However, there is
no easy way to do this for the other methods. Alternatively, choosing $\gamma$ and $\lambda$ according to Theorem
3 gives another good starting point. We can do this for mod-BPDN and BPDN, but we cannot do this for
the other methods (we show examples using this approach later). For a fair error comparison, for each
algorithm, we selected $\gamma$ from a set of values [0.00001 0.00005 0.0001 0.0005 0.001 0.005 0.01 0.1].
We tried all these values for a small number of simulations (10 simulations) and then picked the best
one (the one with the smallest N-RMSE) for each algorithm. For weighted $\ell_1$ reconstruction, we also pick
the best $\gamma'$ in (10) from the same set in the same way (see footnote 4). For reg-mod-BPDN, $\lambda$ should be larger when
the signal estimate is good and should be decreased when the signal estimate is not so good. We can use
$\lambda = \alpha\sigma_w^2/\sigma_p^2$ to adaptively determine its value for different choices of $\sigma_w^2$ and $\sigma_p^2$. In our simulations,
we used $\alpha = 0.2$ for Fig. 3(a), (b) and (d) and $\alpha = 0.05$ for Fig. 3(c).
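
The parameter selection just described can be sketched as below. Several things here are assumptions rather than the authors' procedure: the helper names `lam_from_variances`, `n_rmse` and `pick_gamma` are hypothetical, the N-RMSE is assumed to be the square root of the summed squared error divided by the summed squared signal norm over the pilot runs, and `toy_reconstruct` is a made-up soft-thresholding denoiser standing in for whichever solver is being tuned.

```python
import math

GAMMA_GRID = [0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]

def lam_from_variances(alpha, sigma_w2, sigma_p2):
    """lambda = alpha * sigma_w^2 / sigma_p^2."""
    return alpha * sigma_w2 / sigma_p2

def n_rmse(pairs):
    """Assumed definition: sqrt(sum ||x - x_hat||^2 / sum ||x||^2) over runs."""
    num = sum(sum((a - b) ** 2 for a, b in zip(x, xh)) for x, xh in pairs)
    den = sum(sum(a * a for a in x) for x, _ in pairs)
    return math.sqrt(num / den)

def pick_gamma(reconstruct, trials, grid=GAMMA_GRID):
    """Try every gamma on a few pilot trials and keep the smallest N-RMSE.

    reconstruct(trial, gamma) must return an estimate of trial["x"].
    """
    scores = {g: n_rmse([(t["x"], reconstruct(t, g)) for t in trials])
              for g in grid}
    best = min(scores, key=scores.get)
    return best, scores

def toy_reconstruct(trial, gamma):
    # Hypothetical stand-in solver: soft-threshold a noisy copy of x.
    return [math.copysign(max(abs(v) - gamma, 0.0), v) for v in trial["y"]]

pilot = [{"x": [1.0, 0.0, 0.0, -1.0], "y": [1.05, 0.02, -0.03, -0.96]}]
best_gamma, scores = pick_gamma(toy_reconstruct, pilot)
lam = lam_from_variances(alpha=0.2, sigma_w2=1e-5, sigma_p2=1e-3)
```

With $\alpha = 0.2$, $\sigma_w^2 = 10^{-5}$ and $\sigma_p^2 = 10^{-3}$ the rule gives $\lambda = 0.002$.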

We fixed the chosen $\gamma$, $\gamma'$ and $\lambda$ and did Monte Carlo averaging over 100 simulations. We conclude
the following. (1) When the signal estimate is not good (Fig. 3(b),(d)) or when $n$ is small (Fig. 3(a),(b)),
CS-residual and CS-mod-residual have significantly larger error than reg-mod-BPDN. (2) In case of Fig.
($n = 0.3m$), they also have larger error than mod-BPDN. (3) In all four cases, weighted $\ell_1$ and mod-BPDN
have similar performance. This is also similar to that of reg-mod-BPDN in case of $n = 0.3m$,
but is much worse in case of $n = 0.13m$. (4) We also show a comparison with reg-mod-BPDN-var in Fig.
3(a). Notice that it has larger errors than reg-mod-BPDN for reasons explained in Sec. I-C.

4 To give an example, our finally selected numbers for Fig. 3(d) were $\gamma$ = 0.01, 0.001, 0.001, 0.001, 0.001, 0.001, 0.01, 0.01
for BPDN, mod-BPDN, reg-mod-BPDN, weighted $\ell_1$, LS-CS, CS-residual, CS-mod-residual, mod-CS-residual respectively and
$\gamma'$ = 0.0001.
C. Dynamic MRI application using $\gamma$ from Theorem 3

In Fig. 4, we show comparisons for simulated dynamic MR imaging of an actual larynx image sequence
(Fig. 1(a)(i)). The larynx image is not exactly sparse but is only compressible in the wavelet domain.
We used a two-level Daubechies-4 2D discrete wavelet transform (DWT). The 99%-energy support size
of its wavelet transform vector is $|N_t| \approx 0.07m$. Also, $|\Delta_t| \approx$ O.OObn and $|\Delta_{e,t}| \approx 0.002m$. We used a
32 × 32 block of this sequence and at each time simulated undersampled MRI, i.e. we selected $n$ 2D
discrete Fourier transform (DFT) coefficients using the variable density sampling scheme of [28], and
added iid Gaussian noise with zero mean and variance $\sigma_w^2 = 10$ to each of them. Using a small 32 × 32
block allows easy implementation using CVX (for full sized image sequences, one needs specialized
code). We used $n_0 = 0.18m$ at $t = 0$ and $n = 0.06m$ at $t > 0$.

We implemented dynamic reg-mod-BPDN as described in Algorithm 1. In this problem, the matrix $A =
F_u \cdot W^{-1}$ where $F_u$ contains the selected rows of the 2D-DFT matrix and $W$ is the inverse 2D-DWT matrix
(for a two-level Daubechies-4 wavelet). Reg-mod-BPDN was compared with similarly implemented
reg-mod-BPDN-var and CS-residual algorithms (CS-residual only solved simple BPDN at $t = 0$). We also
compared with simple BPDN (BPDN done for each frame separately). For reg-mod-BPDN and
reg-mod-BPDN-var, the support estimation threshold, $\rho$, was chosen as suggested in [10]: we used $\rho = 20$ which is
slightly larger than the smallest magnitude element in the 99%-energy support which is 15. At $t = 0$, we
used $T_0$ to be the set of indices of the wavelet approximation coefficients. To choose $\gamma$ and $\lambda$ we tried two
different things. (a) We used $\lambda$ and $\gamma$ from the set [0.00001 0.00005 0.0001 0.0005 0.001 0.005 0.01 0.1]
to do the reconstruction for a short training sequence (5 frames), and used the average error to pick the
best $\lambda$ and $\gamma$. We call the resulting reconstruction error plot reg-mod-BPDN-opt. (b) We computed the
average of the $\gamma^*$ obtained from Theorem 3 for the 5-frame training sequence and used this as $\gamma$ for the
test sequence. We selected $\lambda$ from the above set by choosing the one that minimizes the average of the
bound of Theorem 3 for the 5 frames. We call the resulting error plot reg-mod-BPDN-$\gamma^*$. The same two
things were also done for BPDN and CS-residual. For reg-mod-BPDN-var, we only did (a).
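
The two support-related quantities used above can be illustrated with a short sketch. Both definitions are assumptions made here for illustration: the 99%-energy support is taken to be the smallest set of largest-magnitude coefficients holding at least 99% of the total energy, and the support estimate is taken to be the set of indices whose reconstructed magnitude exceeds $\rho$. The function names and the toy coefficient vector are hypothetical.

```python
def energy_support(coeffs, fraction=0.99):
    """Smallest set of largest-magnitude indices holding `fraction` of the energy."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    total = sum(c * c for c in coeffs)
    acc, support = 0.0, []
    for i in order:
        if acc >= fraction * total:
            break
        support.append(i)
        acc += coeffs[i] ** 2
    return sorted(support)

def threshold_support(x_hat, rho):
    """Indices whose magnitude exceeds the threshold rho (assumed rule)."""
    return [i for i, v in enumerate(x_hat) if abs(v) > rho]

# Toy wavelet-like coefficient vector (made up for illustration).
coeffs = [100.0, -50.0, 15.0, 3.0, -2.0, 1.0, 0.5]
support_99 = energy_support(coeffs)
smallest_in_support = min(abs(coeffs[i]) for i in support_99)
T_next = threshold_support(coeffs, rho=20.0)
```

In this toy vector the smallest magnitude inside the 99%-energy support is 15, so a threshold of 20 sits slightly above it, which mirrors the choice of $\rho$ reported above.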

From Fig. 4, we can conclude the following. (1) Reg-mod-BPDN significantly outperforms the other
methods when using so few measurements. (2) Reg-mod-BPDN-var and reg-mod-BPDN have similar
performance in this case. (3) The reconstruction performance of reg-mod-BPDN using $\gamma^*$ from Theorem
3 is close to that of reg-mod-BPDN using the best $\gamma$ chosen from a large set. This indicates that Theorem
3 provides a good way to select $\gamma$ in practice.

D. Comparing the result of Theorem 1 

In Table I, we compare the result of Theorem 1 for reg-mod-BPDN, mod-BPDN and BPDN. We used
$m = 256$, $|N| = 26 = 0.1m$, $|\Delta| = 0.04|N| = |\Delta_e|$, $\sigma_p^2 = 10^{-3}$, $\beta_l = 1$ and $\beta_m = \beta_s = 0.25$. Also,
$\sigma_w^2 = 10^{-5}$ and we varied $n$. For each experiment with a given $n$, we did the following. We did 100
Monte Carlo simulations. Each time, we evaluated the sufficient conditions for the bound of reg-mod-BPDN
to hold. We say the bound holds if all the sufficient conditions hold for at least 98 realizations.
If this did not happen, we record not hold in Table I. If this did happen, then we recorded ^ ^""^ I 
where E[-] denotes the Monte Carlo average computed over those realizations for which the sufficient 
conditions do hold. Here, "bound" refers to the right hand side of (28) computed with $\gamma = \gamma^*_{T,\lambda}(\Delta)$
given in (27). An analogous procedure was followed for both mod-BPDN and BPDN.

The comparisons are summarized in Table I. For reg-mod-BPDN, we selected $\lambda$ from the set
[0.00001 0.00005 0.0001 0.0005 0.001 0.005 0.01 0.1] by picking the one that gave the smallest bound.
Clearly the reg-mod-BPDN result holds with the smallest $n$, while the BPDN result needs a very large
$n$ ($n \ge 0.9m$). Also, even with $n = 0.9m$, the BPDN error bound is very large.



TABLE I
Sufficient conditions and normalized bounds comparison of reg-mod-BPDN, mod-BPDN and
BPDN. Signal length $m = 256$, support size $|N| = 0.1m$, $|\Delta| = 4\%|N|$, $|\Delta_e| = 4\%|N|$, $\sigma_w^2 = 10^{-5}$
and $\sigma_p^2 = 10^{-3}$. "Not hold" means that one or all of the sufficient conditions do not hold.

n        Reg-mod-BPDN    Mod-BPDN    BPDN
0.13m    0.885           not hold    not hold
0.19m    0.161           0.303       not hold
0.5m     0.0199          0.0199      not hold
0.9m     0.014           0.014       0.27



E. Comparing Theorems 1, 2, 3 

In Fig. 5(a), we compare the results from Theorems 1, 2 and 3 for one simulation. We plot for
$|\Delta|/|N|$ ranging from 0 to 0.2. Also, we used $m = 256$, $|N| = 26$, $|\Delta_e| = 0.1|N|$, $\sigma_p^2 = 10^{-3}$, $\beta_l = 1$ and
$\beta_m = \beta_s = 0.25$. Also, $n = 0.13m$ and $\sigma_w^2 = 10^{-5}$. We used $\gamma = \gamma^*$ given in the respective theorems,
and we set $\lambda = 10\sigma_w^2/\sigma_p^2$. We notice the following. (1) The bound of Theorem 1 is much larger than
that of Theorem 2 or 3, even for $|\Delta| = 0.04|N|$. (2) For larger values of $|\Delta|$, the sufficient conditions of
Theorem 1 do not hold and hence it does not provide a bound at all. (3) For reasons explained in Sec.
IV, in this case, the bound of Theorem 3 is equal to that of Theorem 2. Recall that the computational
complexity of the bound from Theorem 2 is exponential in $|\Delta|$. However, if $|\Delta|$ is small, e.g. in our
simulations $|\Delta| \le 5$, this is still doable.

F. Upper bound comparisons using Theorem 3 

Fig. 5. In (a), we compare the three bounds from Theorems 1, 2 and 3 for one realization of $x$. In (b) and (c),
we compare the normalized average bounds from Theorem 3 and reconstruction errors with random Gaussian and
partial Fourier measurements respectively.

In Fig. 5(b), we do two things. (1) We compare the reconstruction error bounds from Theorem 3 for
reg-mod-BPDN, mod-BPDN and BPDN and compare them with the bounds for LS-CS error given in [27,
Corollary 1]. All bounds hold without any sufficient conditions, which is what makes this comparison
possible. (2) We also use the $\gamma^*$ given by Theorem 3 to obtain the reconstructions and compute the
Monte Carlo averaged N-RMSE. Comparing this with the Monte Carlo averaged upper bound on the
N-RMSE, J allows us to evaluate the tightness of a bound. Here E[-] denotes the mean computed 
over 100 Monte Carlo simulations and "bound" refers to the right hand side of (41). We used $m = 256$,
$|N| = 26$, $|\Delta_e| = 0.1|N|$, $\sigma_p^2 = 10^{-3}$, $\beta_l = 1$, $\beta_m = \beta_s = 0.25$, and $|\Delta|$ was varied from 0 to $0.2|N|$.
Also, $n = 0.13m$ and $\sigma_w^2 = 10^{-5}$.

From the figure, we can observe the following. (1) Reg-mod-BPDN has much smaller bounds than those
of mod-BPDN, BPDN and LS-CS. The difference between the reg-mod-BPDN and mod-BPDN bounds is
minor when $|\Delta|$ is small but increases as $|\Delta|$ increases. (2) The conclusions from the reconstruction error
comparisons are similar to those seen from the bound comparisons, indicating that the bound can serve
as a useful proxy to decide which algorithm to use when (notice that bound computation is much faster than
computing the reconstruction error). (3) Also, the reg-mod-BPDN and mod-BPDN bounds are quite tight as
compared to the LS-CS bound. The BPDN bound and error are both 100%. The 100% error is seen because the
reconstruction is the all-zeros vector.

In Fig. 5(c), we did a similar set of experiments for the case where $A$ corresponds to a simulated
MRI experiment, i.e. $A = F_u \cdot W^{-1}$ where $F_u$ contains randomly selected rows of the 2D-DFT matrix
and $W$ is the inverse 2D-DWT matrix (for a two-level Daubechies-4 wavelet). We used $n = 0.17m$ and
$\sigma_w^2 = 10^{-3}$. All other parameters were the same as in Fig. 5(b). Our conclusions are also the same.

The complexity for Theorem 3 is polynomial in $|\Delta|$ whereas that of the LS-CS bound [27, Corollary
1] is exponential in $|\Delta|$. To also show a comparison with the LS-CS bound, we had to choose a small value
of $m = 256$ so that the maximum value of $|\Delta| = 0.2|N| = 5$ was small enough. In terms of MATLAB
time, computation of the Theorem 3 bound for reg-mod-BPDN took 0.2 seconds while computing the
LS-CS bound took 1.2 seconds. For all methods except LS-CS, we were able to do the same thing fairly
quickly even for $m = 4096$, or even larger. It took only 8 seconds to compute the bound of Theorem 3
when $m = 4096$, $n = 0.13m$, $|N| = 410 = 0.1m$ and $|\Delta| = |\Delta_e| = 0.1|N| = 41$. The LS-CS bound,
whose complexity is $O(2^{41})$, was taking too long (it did not finish even in a few minutes and we
stopped the computation after that). The same was also true for the Theorem 2 bound which also has
exponential complexity.

VI. Conclusions and Future Work 

In this work we studied the problem of sparse reconstruction from noisy undersampled measurements
when partial, and partly erroneous, knowledge of the signal's support and an erroneous estimate of the
signal values on the "partly known support" are also available. Denote the support knowledge by $T$ and
the signal value estimate on $T$ by $\hat{\mu}_T$. We proposed and studied a solution called regularized modified-BPDN
which tries to find the signal that is sparsest outside $T$, while being "close enough" to $\hat{\mu}_T$ on
$T$, and while satisfying the data constraint. We showed how to obtain computable error bounds that
hold without any sufficient conditions. This made it easy to compare bounds for the various approaches
(corresponding results for modified-BPDN and BPDN follow as direct corollaries). Exhaustive empirical
error comparisons with these and many other existing approaches are also provided.

In ongoing work, we are evaluating the utility of reg-mod-BPDN for recursive functional MR imaging
to detect brain activation patterns in response to stimuli. On the other end, we are also working on
obtaining conditions under which it will remain "stable" (its error will be bounded by a time-invariant
and small value) for a recursive recovery problem [26].

Appendix 

A. Proof of Proposition 1 

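When $\lambda = 0$, $Q_{T,0}(S) = A_{T \cup S}{}' A_{T \cup S}$. Thus, $Q_{T,0}(S)$ is invertible iff $A_{T \cup S}$ is full rank. When $\lambda > 0$,
$Q_{T,\lambda}(S)$ is as defined in (20). Apply the block matrix inversion lemma

$$\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix}^{-1} = \begin{bmatrix} (\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1} & -(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1}\mathbf{B}\mathbf{D}^{-1} \\ -\mathbf{D}^{-1}\mathbf{C}(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1} & \mathbf{D}^{-1} + \mathbf{D}^{-1}\mathbf{C}(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1}\mathbf{B}\mathbf{D}^{-1} \end{bmatrix}$$

with $\mathbf{A} = A_T{}'A_T + \lambda I_T$, $\mathbf{B} = A_T{}'A_S$, $\mathbf{C} = A_S{}'A_T$ and $\mathbf{D} = A_S{}'A_S$. Clearly, $Q_{T,\lambda}(S)$ is invertible iff
$A_S{}'A_S$ and $A_T{}'RA_T + \lambda I_T$ are invertible, where $R := I - A_S(A_S{}'A_S)^{-1}A_S{}'$. When $A_S$ is full rank, (i)
$A_S{}'A_S$ is full rank; and (ii) $R$ is a projection matrix. Thus $R = R'R$ and so $A_T{}'RA_T = (RA_T)'(RA_T)$
is positive semi-definite. As a result, $A_T{}'RA_T + \lambda I_T$ is positive definite and thus invertible. Hence, when
$A_S$ is full rank, $Q_{T,\lambda}(S)$ is also invertible.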
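
A small numerical check of this statement is sketched below, under stated assumptions: $Q_{T,\lambda}(S)$ is taken to be the Gram matrix of $[A_T\ A_S]$ with $\lambda$ added on the diagonal entries indexed by $T$, which matches the blocks listed in the proof. Since this matrix is symmetric positive semi-definite, it is invertible exactly when it is positive definite, which the sketch tests with a plain Cholesky factorization. The helper names and the tiny example matrices are hypothetical.

```python
import math

def build_Q(cols_T, cols_S, lam):
    """Gram matrix of [A_T A_S] plus lam on the diagonal entries indexed by T.

    cols_T and cols_S are lists of column vectors (lists of floats).
    """
    cols = cols_T + cols_S
    p = len(cols)
    Q = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(p)]
         for i in range(p)]
    for i in range(len(cols_T)):
        Q[i][i] += lam
    return Q

def is_positive_definite(Q, tol=1e-10):
    """Cholesky test: True iff every pivot is larger than tol."""
    p = len(Q)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = Q[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

# A_T has two identical columns (so A_{T u S} is rank deficient);
# A_S has a single nonzero column, hence full column rank.
A_T = [[1.0, 0.0, 1.0], [1.0, 0.0, 1.0]]
A_S = [[0.0, 1.0, 0.0]]

invertible_without_reg = is_positive_definite(build_Q(A_T, A_S, lam=0.0))
invertible_with_reg = is_positive_definite(build_Q(A_T, A_S, lam=0.1))
```

In this example the unregularized matrix is singular because of the repeated column in $A_T$, while any $\lambda > 0$ restores invertibility as long as $A_S$ is full rank.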
B. Proof of Theorem 1

In this subsection, we give the three lemmas for the proof of Theorem 1. To keep notation simple we
remove the subscripts $T, \lambda$ from $Q(\Delta)$, $M$, $P(\Delta)$, $d(\Delta)$, $c(\Delta)$, $\mathrm{ERC}(\Delta)$ in this and other Appendices.

Lemma 1: Suppose that $Q(\Delta)$ is invertible, then

$$\|d(\Delta) - c(\Delta)\|_2 \le \gamma \sqrt{|\Delta|}\, f_1(\Delta) \tag{45}$$

Lemma 1 can be obtained by setting $\nabla L(b) = 0$ and then using block matrix inversion on $Q(\Delta)$. The
proof of Lemma 1 is in Appendix D1. Next, $\|c(\Delta) - x\|_2$ can be bounded using the following lemma.

Lemma 2: Suppose that $Q(\Delta)$ is invertible. Then

$$\|c(\Delta) - x\|_2 \le \lambda f_2(\Delta) \|x_T - \hat{\mu}_T\|_2 + f_3(\Delta) \|w\|_2 \tag{46}$$

The proof of Lemma 2 is in Appendix D2.

Lemma 3: If $Q(\Delta)$ is invertible, $\mathrm{ERC}(\Delta) > 0$, and $\gamma \ge \gamma^*(\Delta)$, then $L(b)$ has a unique minimizer
which is equal to $d(\Delta)$.

Lemma 3 can be obtained in a fashion similar to [3], HI. Its proof is given in Appendix D3.

Combining Lemmas 1, 2 and 3, and using the fact that $\|d(\Delta) - x\|_2 \le \|d(\Delta) - c(\Delta)\|_2 + \|c(\Delta) - x\|_2$,
we get Theorem 1.

C. Proof of Theorem 2 

The following lemma is needed for the proof of the corollaries leading to Theorem 2. 
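Lemma 4: Suppose that $Q(\tilde{\Delta})$ is invertible. Then

$$\|c(\tilde{\Delta}) - x\|_2 \le \lambda f_2(\tilde{\Delta}) \|x_T - \hat{\mu}_T\|_2 + f_3(\tilde{\Delta}) \|w\|_2 + f_4(\tilde{\Delta}) \|x_{\Delta \setminus \tilde{\Delta}}\|_2 \tag{47}$$

Since $c(\tilde{\Delta})$ is only supported on $T \cup \tilde{\Delta}$ and $y = A_{T \cup \tilde{\Delta}} x_{T \cup \tilde{\Delta}} + A_{\Delta \setminus \tilde{\Delta}} x_{\Delta \setminus \tilde{\Delta}} + w$, the last term of (47)
can be obtained by separating out $x_{\Delta \setminus \tilde{\Delta}}$. The proof of Lemma 4 is given in Appendix D4.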

Using Lemma 4, we can obtain Corollary 2 and then Corollary 3. Then, minimizing over all allowed $\tilde{\Delta}$'s
in Corollary 3, we get Theorem 2. The proofs of Corollaries 2 and 3 are given as follows.

1) Proof of Corollary 2: Notice from the proofs of Lemma 1 and Lemma 3 that nothing in the result
changes if we replace $\Delta$ by a $\tilde{\Delta} \subseteq \Delta$. By Lemma 1 for $\tilde{\Delta}$, we are able to bound $\|d(\tilde{\Delta}) - c(\tilde{\Delta})\|_2$.
Hence, we get the first term of (30). Next, invoke Lemma 4 to bound $\|c(\tilde{\Delta}) - x\|_2$ and we obtain the
remaining three terms of (30). Lemma 3 for $\tilde{\Delta}$ gives the sufficient conditions under which $d(\tilde{\Delta})$ is the unique
unconstrained minimizer of $L(b)$.
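2) Proof of Corollary 3: Corollary 3 is obtained by bounding $\gamma^*(\tilde{\Delta})$. $\gamma^*(\tilde{\Delta}) =
\|A_{(T \cup \tilde{\Delta})^c}{}'(y - Ac(\tilde{\Delta}))\|_\infty / \mathrm{ERC}(\tilde{\Delta})$ can be bounded by rewriting $y - Ac(\tilde{\Delta}) = A_{T \cup \Delta}(x_{T \cup \Delta} - (c(\tilde{\Delta}))_{T \cup \Delta}) + w$ and
then bounding $\|x_{T \cup \Delta} - (c(\tilde{\Delta}))_{T \cup \Delta}\|_2 = \|x - c(\tilde{\Delta})\|_2$ using Lemma 4. Doing this, we get

$$\begin{aligned}
\|A_{(T \cup \tilde{\Delta})^c}{}'(y - Ac(\tilde{\Delta}))\|_\infty
&\le \max_{i \notin T \cup \tilde{\Delta}} |A_i{}' A_{T \cup \Delta}(x_{T \cup \Delta} - (c(\tilde{\Delta}))_{T \cup \Delta})| + |A_i{}'w| \\
&\le \max_{i \notin T \cup \tilde{\Delta}} \|A_i{}' A_{T \cup \Delta}\|_2 \|x_{T \cup \Delta} - (c(\tilde{\Delta}))_{T \cup \Delta}\|_2 + |A_i{}'w| \\
&\le \mathrm{maxcor}(\tilde{\Delta}) \lambda f_2(\tilde{\Delta}) \|x_T - \hat{\mu}_T\|_2 + \mathrm{maxcor}(\tilde{\Delta}) f_3(\tilde{\Delta}) \|w\|_2 \\
&\quad + \mathrm{maxcor}(\tilde{\Delta}) f_4(\tilde{\Delta}) \|x_{\Delta \setminus \tilde{\Delta}}\|_2 + \|A_{(T \cup \tilde{\Delta})^c}{}'w\|_\infty
\end{aligned}$$

Using the above inequality to bound $\gamma^*(\tilde{\Delta})$ and replacing $\gamma$ in $f(T, \lambda, \Delta, \tilde{\Delta}, \gamma)$, given in (30), by this
bound, we get (32).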



D. Proof of Lemmas 1, 2, 3, 4 

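1) Proof of Lemma 1: We use the approach of [3, Lemma 3]. We can minimize the function $L(b)$
over all vectors supported on the set $T \cup \Delta$ by minimizing:

$$F(b) \triangleq \frac{1}{2}\|y - A_{T \cup \Delta} b_{T \cup \Delta}\|_2^2 + \frac{1}{2}\lambda\|b_T - \hat{\mu}_T\|_2^2 + \gamma\|b_\Delta\|_1 \tag{48}$$

Since $Q(\Delta)$ is invertible, $F(b)$ is strictly convex as a function of $b_{T \cup \Delta}$. Then at the unique minimizer,
$d(\Delta)$, $0 \in \partial F(b)|_{b=d(\Delta)}$. Let $\partial\|b_{T^c}\|_1|_{b=d(\Delta)}$ denote the subgradient set of $\|b_{T^c}\|_1$ at $b = d(\Delta)$. Then
clearly any $\phi$ in this set satisfies

$$\phi_T = 0 \tag{49}$$
$$\|\phi_{T^c}\|_\infty \le 1 \tag{50}$$

Now, $0 \in \partial F(b)|_{b=d(\Delta)}$ implies that

$$(A_{T \cup \Delta}{}'A_{T \cup \Delta})[d(\Delta)]_{T \cup \Delta} - A_{T \cup \Delta}{}'y + \lambda \begin{bmatrix} [d(\Delta)]_T - \hat{\mu}_T \\ 0_\Delta \end{bmatrix} + \gamma\phi_{T \cup \Delta} = 0 \tag{51}$$

Simplifying the above equation, we get

$$[d(\Delta)]_{T \cup \Delta} = Q(\Delta)^{-1}\left(A_{T \cup \Delta}{}'y + \lambda \begin{bmatrix} \hat{\mu}_T \\ 0_\Delta \end{bmatrix} - \gamma\phi_{T \cup \Delta}\right) \tag{52}$$

Therefore, using (49) and (25), we have

$$[c(\Delta)]_{T \cup \Delta} - [d(\Delta)]_{T \cup \Delta} = Q(\Delta)^{-1}\gamma\phi_{T \cup \Delta} = Q(\Delta)^{-1} \begin{bmatrix} 0_T \\ \gamma\phi_\Delta \end{bmatrix} \tag{53}$$

Since

$$Q(\Delta) = \begin{bmatrix} A_T{}'A_T + \lambda I_T & A_T{}'A_\Delta \\ A_\Delta{}'A_T & A_\Delta{}'A_\Delta \end{bmatrix} \tag{54}$$

using the block matrix inversion lemma

$$\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix}^{-1} = \begin{bmatrix} \mathbf{A}^{-1} + \mathbf{A}^{-1}\mathbf{B}(\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1}\mathbf{C}\mathbf{A}^{-1} & -\mathbf{A}^{-1}\mathbf{B}(\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1} \\ -(\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1}\mathbf{C}\mathbf{A}^{-1} & (\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1} \end{bmatrix}$$

with $\mathbf{A} = A_T{}'A_T + \lambda I_T$, $\mathbf{B} = A_T{}'A_\Delta$, $\mathbf{C} = A_\Delta{}'A_T$ and $\mathbf{D} = A_\Delta{}'A_\Delta$ and using $\phi_T = 0$, we obtain

$$[c(\Delta)]_{T \cup \Delta} - [d(\Delta)]_{T \cup \Delta} = \begin{bmatrix} -\gamma(A_T{}'A_T + \lambda I_T)^{-1}A_T{}'A_\Delta(A_\Delta{}'MA_\Delta)^{-1}\phi_\Delta \\ \gamma(A_\Delta{}'MA_\Delta)^{-1}\phi_\Delta \end{bmatrix}$$

Since $\|\phi_\Delta\|_\infty \le 1$, the bound of (45) follows.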

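2) Proof of Lemma 2: Recall that $c(\Delta)$ is given in (25). Since both $x$ and $c(\Delta)$ are zero outside $T \cup \Delta$,
$\|c(\Delta) - x\|_2 = \|[c(\Delta)]_{T \cup \Delta} - x_{T \cup \Delta}\|_2$. With $y = Ax + w$ and $Ax = A_{T \cup \Delta}x_{T \cup \Delta}$, we have

$$A_{T \cup \Delta}{}'y = A_{T \cup \Delta}{}'(A_{T \cup \Delta}x_{T \cup \Delta} + w) \tag{55}$$

Notice that $A_{T \cup \Delta}{}'A_{T \cup \Delta} = Q(\Delta) - \lambda \begin{bmatrix} I_T & 0_{T,\Delta} \\ 0_{\Delta,T} & 0_{\Delta,\Delta} \end{bmatrix}$. Using (55), we obtain the following equation

$$A_{T \cup \Delta}{}'y = Q(\Delta)x_{T \cup \Delta} - \lambda \begin{bmatrix} x_T \\ 0_\Delta \end{bmatrix} + A_{T \cup \Delta}{}'w \tag{56}$$

Then, using (25) we can obtain

$$[c(\Delta)]_{T \cup \Delta} - x_{T \cup \Delta} = \lambda Q(\Delta)^{-1} \begin{bmatrix} \hat{\mu}_T - x_T \\ 0_\Delta \end{bmatrix} + Q(\Delta)^{-1}A_{T \cup \Delta}{}'w$$

Finally, this gives (46).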

3) Proof of Lemma 3: The proof is similar to that in (3J and HI. Recall that d(A) minimizes the 
function L(b) over all b supported on T U A. We need to show that if 7 > 7* (A), then d(A) is the 
unique global minimizer of L(b). 

The idea is to prove under the given condition, any small perturbation h on d(A) will increase function 
L(d(A)),i.e. L(d(A) + h) — L(d(A)) > 0, V^Hoo < e for e small enough. Then since L(b) is a convex 
function, d(A) will be the unique global minimizer ||3l. 

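Similar to [?], we first split the perturbation into two parts, $h = u + v$, where $u$ is supported on $T \cup \Delta$ and $v$ is supported on $(T \cup \Delta)^c$. Clearly $\|u\|_\infty \le \|h\|_\infty \le \epsilon$. We consider the case $v \neq 0$ since the case $v = 0$ is already covered in Lemma 1. Then

$$L(d(\Delta) + h) = \frac{1}{2}\|y - A(d(\Delta) + u) - Av\|_2^2 + \frac{1}{2}\lambda\|[d(\Delta)]_T + u_T + v_T - \mu_T\|_2^2 + \gamma\|(d(\Delta) + u)_{T^c} + v_{T^c}\|_1.$$

Then, we can obtain

$$L(d(\Delta) + h) - L(d(\Delta)) = L(d(\Delta) + u) - L(d(\Delta)) + \frac{1}{2}\|Av\|_2^2 - \langle y - Ad(\Delta), Av\rangle + \langle Au, Av\rangle + \gamma\|v_{T^c}\|_1.$$

Since $d(\Delta)$ minimizes $L(b)$ over all vectors supported on $T \cup \Delta$, $L(d(\Delta) + u) - L(d(\Delta)) \ge 0$. Then, since $L(d(\Delta) + u) - L(d(\Delta)) \ge 0$ and $\|Av\|_2^2 \ge 0$, we need to prove that the remaining terms are positive, i.e., $\gamma\|v_{T^c}\|_1 - \langle y - Ad(\Delta), Av\rangle + \langle Au, Av\rangle > 0$. Instead, we can prove the stronger condition $\gamma\|v_{T^c}\|_1 - |\langle y - Ad(\Delta), Av\rangle| - |\langle Au, Av\rangle| > 0$. Since $\langle y - Ad(\Delta), Av\rangle = v'A'(y - Ad(\Delta))$ and $v$ is supported on $(T \cup \Delta)^c$,

$$|\langle y - Ad(\Delta), Av\rangle| = |v_{(T\cup\Delta)^c}'A_{(T\cup\Delta)^c}'(y - Ad(\Delta))| \le \|v\|_1\|A_{(T\cup\Delta)^c}'(y - Ad(\Delta))\|_\infty.$$

Thus,

$$|\langle y - Ad(\Delta), Av\rangle| \le \max_{\omega \notin T\cup\Delta}|\langle y - Ad(\Delta), A_\omega\rangle|\,\|v\|_1.$$

Meanwhile,

$$|\langle Au, Av\rangle| \le \|A'Au\|_\infty\|v\|_1 \le \epsilon\|A'A\|_\infty\|v\|_1. \tag{57}$$

And $\|v\|_1 = \|v_{T^c}\|_1$ since $v$ is supported on $(T \cup \Delta)^c \subseteq T^c$. Then what we need to prove is

$$\left[\gamma - \max_{\omega \notin T\cup\Delta}|\langle y - Ad(\Delta), A_\omega\rangle| - \epsilon\|A'A\|_\infty\right]\|v\|_1 > 0. \tag{58}$$

Since we can select $\epsilon > 0$ as small as we like, we just need to show

$$\gamma - \max_{\omega \notin T\cup\Delta}|\langle y - Ad(\Delta), A_\omega\rangle| > 0. \tag{59}$$

Since $y - Ad(\Delta) = (y - Ac(\Delta)) + A(c(\Delta) - d(\Delta))$, and by Lemma 1 we know that $A(c(\Delta) - d(\Delta)) = \gamma MA_\Delta(A_\Delta'MA_\Delta)^{-1}\phi_\Delta$, and since $\|\phi_\Delta\|_\infty \le 1$, we conclude that $d(\Delta)$ is the unique global minimizer if

$$\|A_{(T\cup\Delta)^c}'(y - Ac(\Delta))\|_\infty < \gamma\left[1 - \max_{\omega \notin T\cup\Delta}\|P(\Delta)A_\Delta'MA_\omega\|_1\right]. \tag{60}$$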
Next, we will show that $d(\Delta)$ is also the unique global minimizer under the following condition

$$\|A_{(T\cup\Delta)^c}'(y - Ac(\Delta))\|_\infty = \gamma\left[1 - \max_{\omega \notin T\cup\Delta}\|P(\Delta)A_\Delta'MA_\omega\|_1\right]. \tag{61}$$

Since the perturbation $h \neq 0$, either $u \neq 0$ or $v \neq 0$. Therefore, we discuss the following three cases.

1) $u \neq 0$. In this case, we know $L(d(\Delta) + u) - L(d(\Delta)) > 0$ since $d(\Delta)$ is the unique minimizer over all vectors supported on $T \cup \Delta$. Therefore, $L(d(\Delta) + h) - L(d(\Delta)) > 0$ if (61) holds.

2) $u = 0$, $v \neq 0$ and $v$ is not in the null space of $A$, i.e., $Av \neq 0$. In this case, we know $\|Av\|_2 > 0$. Hence, $L(d(\Delta) + h) - L(d(\Delta)) > 0$ when (61) holds.

3) $u = 0$, $v \neq 0$ and $Av = 0$. In this case, $L(d(\Delta) + h) - L(d(\Delta)) = \gamma\|v_{T^c}\|_1$. Thus, $L(d(\Delta) + h) - L(d(\Delta)) > 0$ if $\gamma > 0$. Clearly, $L(d(\Delta) + h) - L(d(\Delta)) > 0$ when (61) holds.

Finally, combining (60) and (61), we can conclude that $d(\Delta)$ is the unique global minimizer if the following condition holds

$$\|A_{(T\cup\Delta)^c}'(y - Ac(\Delta))\|_\infty \le \gamma\,\mathrm{ERC}(\Delta). \tag{62}$$
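All quantities in the final condition are computable from $A$, $y$, $T$, $\Delta$, $\mu_T$ and $\lambda$. The following is a minimal sketch under stated assumptions: $M = I - A_T(A_T'A_T + \lambda I_T)^{-1}A_T'$, $P(\Delta) = (A_\Delta'MA_\Delta)^{-1}$, $\mathrm{ERC}(\Delta) = 1 - \max_{\omega \notin T\cup\Delta}\|P(\Delta)A_\Delta'MA_\omega\|_1$ (as implied by comparing (60) and (62)), and $c(\Delta)$ in the regularized least squares form. The helper name `erc_and_gamma_threshold` and all numerical values are hypothetical, and the helper assumes that both $T$ and the complement of $T \cup \Delta$ are nonempty.

```python
import numpy as np

def erc_and_gamma_threshold(A, y, T, D, mu_T, lam):
    """Return (ERC(Delta), smallest gamma meeting the final condition).

    Hypothetical helper: T and D are lists of column indices for T and Delta.
    """
    n, m = A.shape
    S = list(T) + list(D)
    rest = [j for j in range(m) if j not in S]        # indices outside T u Delta
    AT, AD, AS = A[:, list(T)], A[:, list(D)], A[:, S]

    M = np.eye(n) - AT @ np.linalg.solve(AT.T @ AT + lam * np.eye(len(T)), AT.T)
    P = np.linalg.inv(AD.T @ M @ AD)

    # Column omega of `cols` is P A_Delta' M A_omega; take the largest l1 norm.
    cols = P @ AD.T @ M @ A[:, rest]
    erc = 1.0 - float(np.max(np.sum(np.abs(cols), axis=0)))

    # c(Delta) on T u Delta (assumed regularized least squares form).
    pad = np.zeros(len(D))
    Q = AS.T @ AS + lam * np.diag(np.concatenate([np.ones(len(T)), pad]))
    c_S = np.linalg.solve(Q, AS.T @ y + lam * np.concatenate([mu_T, pad]))

    # Left-hand side of the condition: || A_{(T u Delta)^c}' (y - A c) ||_inf.
    resid_corr = float(np.max(np.abs(A[:, rest].T @ (y - AS @ c_S))))
    gamma_min = resid_corr / erc if erc > 0 else float("inf")
    return erc, gamma_min

rng = np.random.default_rng(2)
n, m = 30, 40
A = rng.standard_normal((n, m)) / np.sqrt(n)
T, D = [0, 1, 2, 3], [4]
x = np.zeros(m)
x[T + D] = 1.0
y = A @ x + 0.01 * rng.standard_normal(n)
erc, gamma_min = erc_and_gamma_threshold(A, y, T, D, x[T], lam=0.1)
print("ERC:", erc, "smallest admissible gamma:", gamma_min)
```

When the computed ERC is not positive, the condition cannot be met for any $\gamma$, and the sketch reports an infinite threshold.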



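4) Proof of Lemma 4: Consider a $\tilde\Delta \subseteq \Delta$ such that $Q(\tilde\Delta)$ has full rank. Since $A_{T\cup\tilde\Delta}'y = A_{T\cup\tilde\Delta}'(A_{T\cup\tilde\Delta}x_{T\cup\tilde\Delta} + w + A_{\Delta\setminus\tilde\Delta}x_{\Delta\setminus\tilde\Delta})$, expanding these terms we have

$$A_{T\cup\tilde\Delta}'y = Q(\tilde\Delta)x_{T\cup\tilde\Delta} - \lambda\begin{bmatrix} x_T \\ 0_{\tilde\Delta} \end{bmatrix} + A_{T\cup\tilde\Delta}'w + A_{T\cup\tilde\Delta}'A_{\Delta\setminus\tilde\Delta}x_{\Delta\setminus\tilde\Delta}. \tag{63}$$

Then, using this in the expression for $c(\tilde\Delta)$ from (25), we get

$$[c(\tilde\Delta)]_{T\cup\Delta} - x_{T\cup\Delta} = \begin{bmatrix} \lambda Q(\tilde\Delta)^{-1}\begin{bmatrix} \mu_T - x_T \\ 0_{\tilde\Delta} \end{bmatrix} \\ 0_{\Delta\setminus\tilde\Delta} \end{bmatrix} + \begin{bmatrix} Q(\tilde\Delta)^{-1}A_{T\cup\tilde\Delta}'w \\ 0_{\Delta\setminus\tilde\Delta} \end{bmatrix} + \begin{bmatrix} Q(\tilde\Delta)^{-1}A_{T\cup\tilde\Delta}'A_{\Delta\setminus\tilde\Delta}x_{\Delta\setminus\tilde\Delta} \\ -x_{\Delta\setminus\tilde\Delta} \end{bmatrix}. \tag{64}$$

Therefore, we get (47).

Supplementary Material

E. Sufficient Conditions' Comparison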

We briefly compare the results for reg-mod-BPDN, mod-BPDN and BPDN, primarily by comparing the sufficient conditions required for them to hold. Comparing the bounds directly is not easy since each holds under a different set of sufficient conditions. This will be done later using the results of Section IV, which hold without any sufficient conditions. For the comparison of sufficient conditions, we use the restricted isometry constant (RIC), $\delta_S$, and the restricted orthogonality constant (ROC), $\theta_{S,S'}$ [13], defined next. These depend only on the sizes of the sets $T$, $\Delta$ and $N$ and hence make a theoretical comparison easier. However, the comparison can only be qualitative. The RIC and ROC are not computable (the computational complexity is exponential in the set size) and hence cannot be used for numerical comparisons. On the other hand, the ERC and the bounds obtained based on the ERC approach are computable and can be used for a quantitative numerical comparison. We do this comparison later in Sec. V.

1) RIC and ROC definition: The $S$-restricted isometry constant (RIC) [13], $\delta_S$, for a matrix, $A$, is defined as the smallest positive number satisfying

$$(1 - \delta_S)\|c\|_2^2 \le \|A_Tc\|_2^2 \le (1 + \delta_S)\|c\|_2^2 \tag{65}$$

for all subsets $T$ of cardinality $|T| \le S$ and all real vectors $c$ of length $|T|$. The restricted orthogonality constant (ROC) [13], $\theta_{S_1,S_2}$, is defined as the smallest real number satisfying

$$|c_1'A_{T_1}'A_{T_2}c_2| \le \theta_{S_1,S_2}\|c_1\|_2\|c_2\|_2 \tag{66}$$

for all disjoint sets $T_1, T_2$ with $|T_1| \le S_1$, $|T_2| \le S_2$ and $S_1 + S_2 \le m$, and for all nonzero vectors $c_1$, $c_2$ of length $|T_1|$, $|T_2|$ respectively. If we let $c_2 = A_{T_2}'A_{T_1}c_1$, then (66) implies that $\|A_{T_2}'A_{T_1}c_1\|_2^2 \le \theta_{S_1,S_2}\|c_1\|_2\|A_{T_2}'A_{T_1}c_1\|_2$ and so $\|A_{T_2}'A_{T_1}c_1\|_2 \le \theta_{S_1,S_2}\|c_1\|_2$ for all nonzero $c_1$. Thus, using the definition of the matrix norm $\|M\|_2$, this implies that

$$\|A_{T_2}'A_{T_1}\|_2 \le \theta_{S_1,S_2} \tag{67}$$

for all sets with $|T_1| \le S_1$, $|T_2| \le S_2$.
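To make the exponential cost concrete, here is a brute-force sketch that evaluates both constants for a small matrix by enumerating every subset. The function names `ric` and `roc` and the toy matrix are illustrative assumptions. Only subsets of the maximal sizes are enumerated, since by eigenvalue interlacing smaller subsets cannot give a worse value; the number of subsets still grows combinatorially with $m$ and $S$.

```python
import itertools
import numpy as np

def ric(A, S):
    """Brute-force delta_S: worst deviation from 1 of the eigenvalues of A_T'A_T."""
    m = A.shape[1]
    delta = 0.0
    for T in itertools.combinations(range(m), S):
        sub = A[:, list(T)]
        eig = np.linalg.eigvalsh(sub.T @ sub)        # sorted in ascending order
        delta = max(delta, float(eig[-1] - 1.0), float(1.0 - eig[0]))
    return delta

def roc(A, S1, S2):
    """Brute-force theta_{S1,S2}: largest spectral norm of A_T1'A_T2, T1 and T2 disjoint."""
    m = A.shape[1]
    theta = 0.0
    for T1 in itertools.combinations(range(m), S1):
        others = [j for j in range(m) if j not in T1]
        for T2 in itertools.combinations(others, S2):
            cross = A[:, list(T1)].T @ A[:, list(T2)]
            theta = max(theta, float(np.linalg.norm(cross, 2)))
    return theta

# Toy example: two unit-norm columns with inner product 0.6.
A_toy = np.array([[1.0, 0.6],
                  [0.0, 0.8]])
print(ric(A_toy, 1), ric(A_toy, 2), roc(A_toy, 1, 1))
```

For the toy matrix the Gram matrix is $[1, 0.6;\ 0.6, 1]$ with eigenvalues $0.4$ and $1.6$, so $\delta_1 = 0$ and $\delta_2 = \theta_{1,1} = 0.6$ up to round-off.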

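2) Sufficient Conditions' Comparison: Consider mod-BPDN versus BPDN first. Let us compare their ERCs. Using (67), $\|(A_T'A_T + \lambda I_T)^{-1}\|_2 \le 1/(1 - \delta_{|T|} + \lambda)$ and the fact that for a vector $z$ of length $\ell$, $\|z\|_1 \le \sqrt{\ell}\,\|z\|_2$,

$$\mathrm{ERC}_{T,\lambda}(\Delta) \ge 1 - \sqrt{|\Delta|}\,\|P_{T,\lambda}(\Delta)\|_2 \max_{\omega \notin T\cup\Delta}\|A_\Delta'M_{T,\lambda}A_\omega\|_2 \ge 1 - \sqrt{|\Delta|}\;\frac{\theta_{|\Delta|,1} + \dfrac{\theta_{|\Delta|,|T|}\,\theta_{|T|,1}}{1 - \delta_{|T|} + \lambda}}{1 - \delta_{|\Delta|} - \dfrac{\theta_{|\Delta|,|T|}^2}{1 - \delta_{|T|} + \lambda}} \tag{68}$$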

where the numerator of the second term comes from bounding $\|A_\Delta'M_{T,\lambda}A_\omega\|_2$ and the denominator of the second term comes from bounding $\|P_{T,\lambda}(\Delta)\|_2$. In practice, for example in recursive reconstruction applications like real-time dynamic MRI, usually $|\Delta| \approx |\Delta_e| \ll |N|$ and $|N| \approx |T| \approx |T \cup \Delta|$ [25]. Under this assumption, when fewer measurements are available (but still enough to ensure that $\delta_{|N|} < 1$), the denominator for the second term of $\mathrm{ERC}_{\emptyset,0}(N)$ (BPDN), $1 - \delta_{|N|}$, will be smaller than that of $\mathrm{ERC}_{T,0}(\Delta)$ (mod-BPDN), $1 - \delta_{|\Delta|} - \theta_{|\Delta|,|T|}^2/(1 - \delta_{|T|})$. Also, $\sqrt{|N|}$ in its numerator will be larger than $\sqrt{|\Delta|}$ for mod-BPDN, while the other numerator terms will be similar in both cases. This can result in a smaller (and possibly negative) lower bound on the ERC for BPDN.

To compare reg-mod-BPDN and mod-BPDN, notice that mod-BPDN needs $A_{T\cup\Delta}$ to be full rank whereas reg-mod-BPDN only needs $A_\Delta$ to be full rank, which is a much weaker requirement.

We show a numerical comparison in Table I (simulation details are given in Sec. V). Notice that BPDN needs 90% measurements for its ERC to become positive whereas mod-BPDN only needs 19%. Moreover, even with 90% measurements, the ERC of BPDN is only just positive and very small. As a result, its error bound is large (27% normalized mean squared error (NMSE)). Similarly, notice that mod-BPDN needs $n \ge 19\%$ while for reg-mod-BPDN $n = 13\%$ also suffices.

Remark 4: A comparison of sufficient conditions only tells us when a given result can be applied to bound the reconstruction error. For example, in simulations, BPDN of course provides a good reconstruction using far fewer than 90% measurements. However, when $n < 90\%$ we cannot bound its reconstruction error using Theorem 1 above (for BPDN this is the same as the result of [?]).

F. Equivalence between Theorem 2 and Theorem 3 bounds 

We can use the weak law of large numbers (WLLN) to argue that as $n$ and $s = |N|$ approach infinity, the bound from Theorem 3 converges to that of Theorem 2 in probability. We give the basic idea here. The complete proof will be in future work. The WLLN argument applies when

• Each element of $A$ is iid with zero mean and variance $1/n$, i.e. $A = \frac{1}{\sqrt{n}}Z$ where each element of $Z$ is iid with zero mean and unit variance.

• The noise $w$ is bounded in $\ell_2$ norm, i.e. $\|w\|_2 \le \eta$, and

• $n, s \to \infty$.

WLLN can be used to argue that as $n, s \to \infty$, with high probability (w.h.p.), $\mathrm{ERC}(\tilde\Delta)$ and the multipliers $g_1, g_2, g_3, g_4$ depend only on the size, $k$, of the set $\tilde\Delta$, i.e. they are the same for all sets $\tilde\Delta$ of a given size. Thus, the only term in $g(\tilde\Delta)$ that varies for different sets $\tilde\Delta \in \mathcal{G}_k$ is $\|x_{\Delta\setminus\tilde\Delta}\|_2$. Thus $\arg\min_{\mathcal{G}_k} g(\tilde\Delta) = \arg\min_{\mathcal{G}_k}\|x_{\Delta\setminus\tilde\Delta}\|_2$. Since the ERC also only depends on $k$, for a given $k$, either $\mathrm{ERC}(k) > 0$ or $\mathrm{ERC}(k) \le 0$. When $\mathrm{ERC}(k) > 0$, $\mathcal{G}_k = \{\tilde\Delta \subseteq \Delta, |\tilde\Delta| = k\}$, whereas when $\mathrm{ERC}(k) \le 0$, $\mathcal{G}_k$ is empty. The minimum value over an empty set is infinity. Thus, $\min_{\mathcal{G}_k}\|x_{\Delta\setminus\tilde\Delta}\|_2 = B_k$. Using (41), (37) and (40), this means that $g(\tilde\Delta^*) = g(\tilde\Delta^{**})$, i.e. the bounds from Theorems 2 and 3 are equal.

The WLLN argument is as follows. Note that all terms in $g_1, g_2, g_3, g_4$ and ERC that depend on $\tilde\Delta$ are functions of either $A_{\tilde\Delta}'A_{\tilde\Delta}$ or $A_T'A_{\tilde\Delta}$ or $A_{\tilde\Delta}'M_{T,\lambda}A_{\tilde\Delta}$ or $A_{(T\cup\tilde\Delta)^c}'w$. Consider $A_{\tilde\Delta}'A_{\tilde\Delta}$.



Clearly $E[Z_{i,r}^2] = 1$ and its variance, $\mathrm{Var}[Z_{i,r}^2] = 3$, whereas $E[Z_{i,r}Z_{j,r}] = 0$ while $\mathrm{Var}[Z_{i,r}Z_{j,r}] = 1$. Here, $E[\cdot]$ and $\mathrm{Var}[\cdot]$ denote the expectation and variance computed over the distribution of $A$. Thus by WLLN, as $n \to \infty$, $A_{\tilde\Delta}'A_{\tilde\Delta}$ approaches the identity matrix, w.h.p. A similar argument can be made for each element of $A_T'A_{\tilde\Delta}$ to show that this approaches the zero matrix as $n \to \infty$. A similar argument can also be made for $M_{T,\lambda}$ when $s := |N|$ (and hence $|T|$) goes to infinity to show that all its diagonal elements converge to one value and all the non-diagonal ones converge to another value. This fact can then be used to make a WLLN argument for each element of $A_{\tilde\Delta}'M_{T,\lambda}A_{\tilde\Delta}$. Now consider the $g_i$ that contains the term $\|A_{(T\cup\tilde\Delta)^c}'w\|_\infty$. Notice that $(A_{(T\cup\tilde\Delta)^c}'w)_i = \sum_{j=1}^n w_jA_{j,i}$. Taking expectations only over the elements of $A$, $E[(A_{(T\cup\tilde\Delta)^c}'w)_i] = 0$ and $\mathrm{Var}[(A_{(T\cup\tilde\Delta)^c}'w)_i] = \sum_{j=1}^n w_j^2/n$. Thus by WLLN, each element of the vector $A_{(T\cup\tilde\Delta)^c}'w$ approaches zero, and hence its infinity norm also approaches zero w.h.p. Thus, w.h.p., for a given size $k$, all these three matrices and $\|A_{(T\cup\tilde\Delta)^c}'w\|_\infty$, and as a result all of $\mathrm{ERC}, g_1, g_2, g_3, g_4$, converge to a value that does not depend on the set $\tilde\Delta$.
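The two concentration claims used in this argument can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the text: the entries of $Z$ are drawn as standard Gaussians (any iid zero-mean, unit-variance law would serve), the subset size is held fixed at a small value while $n$ grows, and the noise is a fixed vector with $\|w\|_2 = \eta$. The function names are hypothetical.

```python
import numpy as np

def gram_deviation(n, k, rng):
    """Largest entrywise gap between A_sub'A_sub and I_k, with A_sub = Z / sqrt(n)."""
    Z = rng.standard_normal((n, k))          # Gaussian entries are an assumption
    A_sub = Z / np.sqrt(n)                   # each entry has variance 1/n
    return float(np.max(np.abs(A_sub.T @ A_sub - np.eye(k))))

def noise_correlation(n, k, rng, eta=1.0):
    """Infinity norm of A_sub'w for a fixed noise vector with ||w||_2 = eta."""
    A_sub = rng.standard_normal((n, k)) / np.sqrt(n)
    w = np.full(n, eta / np.sqrt(n))         # deterministic w with l2 norm eta
    return float(np.max(np.abs(A_sub.T @ w)))

rng = np.random.default_rng(0)
for n in (50, 500, 5000, 50000):
    print(n, gram_deviation(n, 5, rng), noise_correlation(n, 5, rng))
```

Both printed columns should shrink roughly like $1/\sqrt{n}$, consistent with the variances computed above. The simulation does not address the harder part of the argument, namely the behaviour of $M_{T,\lambda}$ as $s$ grows with $n$.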
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